This next series of posts contains my notes on articles provided by Dr. Newman in his Regression workshop at Roundtable 2008, held at Andrews University in the summer of 2008.
Newman, I., & Newman, C. (2000). A discussion of low r-squares: Concerns and uses. Educational Research Quarterly, 24(2), 3-9.
This article is available in the OCLC Article First database.
The point of this article is that low R-squares shouldn’t be thrown out without consideration. They may have value under certain circumstances, so they should be weighed carefully.
What is an R Square?
So first let’s remind ourselves what an R squared is. Wikipedia has a very detailed overview if you want to read that. Basically, when you’re using linear regression, the R squared is the explained variance: the proportion of the variance in the outcome you’re measuring that can be explained by the predictor variables you’re examining. In other words, why do some people get a higher or lower score on your measurement than others? Your variables may explain some of that variance in scores.
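To make the "explained variance" idea concrete, here is a minimal sketch of the calculation for a regression with one predictor. The function name and the hours/scores data are made up for illustration; they are not from the article.

```python
def r_squared(x, y):
    """Return R^2 for a least-squares line predicting y from a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Fit the least-squares line y = intercept + slope * x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Unexplained variation: squared distances from the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared distances from the mean score.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)

    return 1 - ss_res / ss_tot


# Hypothetical data: hours studied and test scores for six people.
hours_studied = [1, 2, 3, 4, 5, 6]
test_scores = [52, 60, 57, 70, 68, 75]

print(f"R squared: {r_squared(hours_studied, test_scores):.3f}")
```

An R squared of 1 would mean the predictor accounts for all of the differences in scores, and 0 would mean it accounts for none of them.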
Why are R Squares low?
The article suggests some reasons why R Squares can be low.
- They can be low (and are appropriately low) in the early stages of research. That is, not enough research has been done yet to identify all the variables that would account for the variance.
- In social sciences, the predictor variables tend to have small effects.
- There might be some measurement error. It is very difficult in social sciences to measure a construct such as intelligence, attitude, etc. So it’s pretty common to have some measurement error. This is where the reliability and validity scores come into the picture.
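The measurement error point can be seen in a small simulation. This is only a sketch under assumptions I chose myself (normally distributed scores, one predictor, error as large as the true variation), not anything taken from the article. It compares the R squared you would get with a perfectly measured predictor against the R squared from a noisy version of the same predictor.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, which equals R^2 for a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical "true" construct scores (what we would like to measure).
true_scores = [rng.gauss(0, 1) for _ in range(n)]
# The outcome depends on the true scores plus everything we have not modeled.
outcome = [t + rng.gauss(0, 1) for t in true_scores]
# What the instrument actually records: true score plus measurement error.
measured = [t + rng.gauss(0, 1) for t in true_scores]

r2_true = r_squared(true_scores, outcome)
r2_measured = r_squared(measured, outcome)

print(f"R squared with a perfectly measured predictor: {r2_true:.2f}")
print(f"R squared with a noisy measurement:            {r2_measured:.2f}")
```

Under these assumptions the underlying relationship is identical in both cases; the unreliable measurement alone pulls the R squared down.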
How do you know your research is any good?
From what I’ve learned about stats so far, there are a few ways we can look at our data to see what it tells us and if the results are useful.
- Tests of significance tell us how likely it is that an effect like this would show up by chance alone.
- Effect size is another important measurement. It often used to go unreported, but it really is a critical piece of information to help others interpret your results.
- Replicability is also important. In fact, Dr. Newman suggests that a measure of replicability is more useful than mere significance. Maybe it’s significant with this set of respondents, but does it hold up with another set?
Under what circumstances is a low R square “ok”?
The article offers several examples, and some things to consider, where a low effect size or R square is still helpful.
- A drug that explains less than 1% of the variance may still impact 60,000 lives in a population of 1 million.
- The odds at casinos may favor the house only slightly over the players, but that adds up to billions of dollars over time.
- When you’re looking at groups of people rather than individuals, a smaller R squared still has value.
- If the small R square is consistent and replicable, it still has value.
- A low R square may not necessarily be a wrong path (as suggested by McNeil quoted in the article); it may only be a partial explanation of the variance, and further research will improve it by adding additional predictor variables.
- It may be better to have a smaller R square that is replicable vs. a higher R square that isn’t replicable.
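To get a feel for how a tiny R squared can still touch tens of thousands of people, as in the drug example above, here is one possible piece of arithmetic. It uses Rosenthal and Rubin's binomial effect size display, which turns a correlation r into two outcome rates of 0.5 + r/2 and 0.5 - r/2. I am assuming this framing and the value r = 0.06 purely for illustration; I don't know that this is how the article arrived at its figure.

```python
def besd_rates(r):
    """Return (treated_rate, control_rate) under the binomial effect size display."""
    return 0.5 + r / 2, 0.5 - r / 2


r = 0.06                     # assumed correlation, chosen for illustration only
variance_explained = r ** 2  # the R squared for a single predictor

treated_rate, control_rate = besd_rates(r)
population = 1_000_000

# Difference in good outcomes if everyone were treated vs. no one treated.
extra_good_outcomes = round((treated_rate - control_rate) * population)

print(f"Variance explained: {variance_explained:.2%}")
print(f"Extra good outcomes per {population:,} people: {extra_good_outcomes:,}")
```

The variance explained here is well under 1%, yet the difference between the two groups still works out to 60,000 people per million.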
In summary, the point seems to be that it’s ok early on in research to have a smaller R square when the goal is to eventually get to a larger one. This article seems really useful for interpreting research that results in a low R square!