Defining the Dependent Variable
In the Research Methods essay for Chapter 1, we discussed the importance of testable hypotheses—that is, hypotheses that are framed in a way that makes it clear what evidence will confirm them and what evidence will not. Sometimes, though, it’s not obvious how to phrase a hypothesis in testable terms. For example, in Chapter 12 we discuss research on creativity, and within this research investigators often present hypotheses about the factors that might foster creativity or might undermine it. Thus, one hypothesis might be: “When working on a problem, an interruption (to allow incubation) promotes creativity.” To test this hypothesis, we would have to specify what counts as an interruption (5 minutes of working on something else? an hour?). But then we’d also need some way to measure creativity; otherwise, we couldn’t tell if the interruption was beneficial or not.
For this hypothesis, creativity is the dependent variable—that is, the measurement that, according to our hypothesis, might “depend on” the thing being manipulated. The presence or absence of an interruption would be the independent variable—the factor that, according to our hypothesis, influences the dependent variable.
In many studies, it’s easy to assess the dependent variable. For example, consider this hypothesis: “Context reinstatement improves memory accuracy.” Here the dependent variable is accuracy, and this is simple to check—for example, by counting up the number of correct answers on a memory test. In this way, we would easily know whether a result confirmed the hypothesis or not. Likewise, consider this hypothesis: “Implicit memories can speed up performance on a lexical decision task.” Here the dependent variable is response time; again, it is simple to measure, allowing a straightforward test of the hypothesis.
The situation is different, though, for our hypothesis about interruptions and creativity. In this case, people might disagree about whether a particular problem solution (or poem, or painting, or argument) is creative. This will make it difficult to test our hypothesis.
Psychologists generally solve this problem by recruiting a panel of judges to assess the dependent variable. In our example, the judges would review each participant’s response and evaluate how creative it was, perhaps on a scale from 1 to 5. By using a panel of judges rather than just one, we can check directly on whether different judges have different ideas about what creativity is. More specifically, we can calculate the inter-rater reliability among the judges—the degree to which they agree with each other in their assessments. If they disagree with each other, it would appear that the assessment of creativity really is a subjective matter and cannot be a basis for testing hypotheses. In that case, scientific research on this issue may not be possible. But if the judges do agree to a reasonable extent—if the inter-rater reliability is high—then we can be confident that their assessments are neither arbitrary nor idiosyncratic.
Let’s be clear, though, that this is a measure of reliability—that is, a measure of how consistent our measurements are. As the text describes, reliability is separate from validity (i.e., whether we’ve succeeded in measuring what we intended to measure). It’s possible, for example, that all of our judges are reacting to, say, whether they find the responses humorous or not. If the judges all have similar senses of humor, they might agree with each other in this assessment (and so would have a high level of inter-rater reliability), but, even so, they would be judging humor, not creativity (and so would not offer valid assessments). On this basis, measures of inter-rater reliability are an important step toward establishing our measure—but we still need other steps (perhaps what the chapter calls a “predictive validation”) before we’re done.
Notice, in addition, that this way of proceeding doesn’t require us to start out with a precise definition of creativity. Of course, a definition would be very useful because (among other benefits) it would allow us to give the judges on our panel relatively specific instructions. Even without a definition, though, we can just ask the judges to rely on their own sense of what’s creative. This isn’t ideal; we’d prefer to get beyond this intuitive notion. But having a systematic, nonidiosyncratic consensus measurement at least allows our research to get off the ground.
In the same way, consider this hypothesis: “College education improves the quality of critical thinking.” This hypothesis—and many others as well—again involves a complex dependent variable, and might also require a panel of judges to obtain measurements we can take seriously. But by using these panels, we can measure things that seem at the outset to be unmeasurable, and in that way we appreciably broaden the range of hypotheses we can test.
Critical Questions
1. Why would defining a dependent variable be difficult in a study of creativity?

2. What is inter-rater reliability? For what kinds of measurements would it be necessary to measure inter-rater reliability?

3. Select a problem-solving study that was covered in your text. What were the independent variables and dependent variables in this study?
Correlations
Often in psychology, data are analyzed in terms of correlations, and this is certainly true in the study of intelligence. We say that intelligence tests are reliable, for example, because measures of intelligence taken, say, when people are 6 years old are correlated with measures taken from the same people a dozen years later. Likewise, we say that intelligence tests have some validity because measures of intelligence are correlated with other performance measures (grades in school, or some assessment of performance in the workplace). Or, as one more example, we conclude that g exists because we can observe correlations among the various parts of the IQ test—and so someone’s score on a verbal test is correlated with their score on a spatial test.
But what does any of this mean? What is a correlation? The calculation of a correlation begins with a list of pairs: someone’s IQ score at, say, age 6 and then the same person’s score at age 18; the next person’s scores at age 6 and age 18; the same for the next person and the next after that. A correlation examines these pairs and asks, roughly, how well we can predict the second value within each pair once we know the first value. If we know your IQ score at age 6, how confident can we be in our prediction of your score a dozen years later?
Correlation values—usually abbreviated with the letter r—can fall between +1.0 (a so-called “perfect correlation”) and –1.0 (a perfect inverse correlation). Thus the strongest possible correlation is either +1.0 or –1.0. The weakest possible correlation is zero—indicating no relationship. As some concrete examples, the correlation between your height, measured in inches, and your height, measured in centimeters, is +1.0 (because these two measurements are obviously assessing the exact same thing). The correlation between your current distance from the North Pole and your current distance from the South Pole is –1.0 (because each mile you move closer to one pole necessarily takes you one mile away from the other pole). The correlation between your height and your IQ, in contrast, is zero: There’s no indication that taller people differ in their intelligence from shorter people.
Most of the r values you’ll encounter in psychology, though, are more moderate. For example, the chapter mentions a correlation of roughly r = +.50 between someone’s IQ and their GPA in college; the correlation between someone’s IQ score and the score of their (nontwin) brother or sister is about r = +.60. What do these values mean? The full answer is complicated, but here’s an approximation.
Researchers routinely report r values, but the really useful statistic is r²—that is, r × r. Bear in mind here that (as we noted early on) correlations are based on pairs of observations, and the r² value literally tells you how much of the overall variation in one measure within the pair can be predicted, based on the other measure in the pair. Thus, let’s look at the correlation between IQ and school performance (measured in GPA). The correlation is +.50, and so r² is .25. This means that 25% of the variation in GPA is predictable, if you know students’ IQ scores. The remaining 75% of the variation, it seems, has to be explained in other terms.
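The arithmetic here is simple enough to sketch directly. The two r values below come from the essay; the calculation converts each into the proportion of variance explained and the proportion left over.

```python
# Converting r to variance explained (r squared), using the two
# correlations cited in the essay.
r_iq_gpa = 0.50    # IQ vs. college GPA
r_siblings = 0.60  # IQ vs. a nontwin sibling's IQ

for label, r in [("IQ vs. GPA", r_iq_gpa),
                 ("IQ vs. sibling IQ", r_siblings)]:
    r_squared = r * r
    print(f"{label}: r = {r:+.2f}, "
          f"variance explained = {r_squared:.0%}, "
          f"unexplained = {1 - r_squared:.0%}")
```

Note that squaring shrinks moderate values considerably: a correlation of +.50 sounds substantial, yet it accounts for only a quarter of the variation.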
A different way of thinking about these points hinges on the “reduction of uncertainty.” To assess this reduction, you might compare how good your prediction of someone’s school performance will be if you know the person’s IQ, and how good your prediction would be if you didn’t know their IQ. Equivalently, you can compare how uncertain you were in your predictions initially, and how much less uncertain you would be, once you know the person’s IQ score.
But what does “uncertainty” mean in this context? Once again, let’s be clear that correlations allow you to make predictions: If you know someone’s IQ at age 6, a correlation allows you to predict the person’s score at age 18. But predictions, in turn, have two elements: You might predict that the 18-year-old’s IQ will be, let’s say, 110, but you’ll also want to express your degree of uncertainty, so you might say “110 plus-or-minus 8,” or “plus-or-minus 5%,” or something like that. That “plus-or-minus” clause reflects the unexplained variation, and, as correlations grow stronger, your predictions become more precise, and the plus-or-minus bit gets smaller. (If you have a math background, the notion here is that, with stronger correlations, the individual observations are more tightly clustered around the regression line.) These details aren’t crucial, though. What is crucial is the idea that correlations allow predictions, that stronger correlations allow more precise predictions, and that an r² value tells us how much of the data pattern we have explained.
Finally, let’s step away from the nature of correlations in general and consider one last point about correlations in psychology: In psychological research, we find only modest correlations. We’ve said, for example, that the correlation between IQ scores and academic performance is roughly +.50, and, we’ve now said, this means that 25% of the variation from student to student is predictable based on IQ, and the remaining 75% of the variation needs to be explained in other terms. Thus, IQ is a major contributor to performance, but even so, a very large amount of the observed variation—the differences between an A student and a C student, and so on—is produced by the effect of other variables, separate from IQ. Some of these other variables, on their own, matter a lot (including the amount of studying, or choice of strategy in studying). Other variables contribute only a little to the overall pattern, but there are many of these variables, and so, in combination, they too have an impact on the 75% of the variation not accounted for by the IQ score. All of this is, again, a way of saying that IQ is a major factor in determining life outcomes, but a long list of other factors also plays a role. This is one of the reasons that, as we’ve repeatedly said, IQ scores do not shape your destiny.
Critical Questions
1. Is a correlation of +.75 stronger or weaker than a correlation of –.90? Explain your answer, and make sure to include a discussion of the components of a correlation: strength and direction.

2. If you read that the correlation between height and weight is .60, what does that mean?

3. Your sister did not score very well on her IQ test, and is now worried that she won’t succeed in college. What would you tell her? Remember: the correlation between IQ and academic performance is r = .50.