Measurement: Aging and the Psychology of Self-Report
University of Michigan
Social psychological research includes the measurement of a wide range of phenomena, from person perception and stereotyping to political attitudes and consumer preferences, and from perceptions of social change to well-being and social support. Social psychology’s key contribution to measurement is not the development of particular measures for specific content domains, but basic research into the cognitive and communicative processes underlying self-reports, with implications that cut across many content domains. For comprehensive reviews of the psychology of self-reports see Schwarz (1996, 1999b); Sudman, Bradburn, and Schwarz (1996); Tourangeau, Rips, and Rasinski (2000); and the contributions in Sirken et al. (1999). As will become apparent, the cognitive changes associated with normal human aging can profoundly affect the processes underlying self-reports of opinions and behaviors, giving rise to age-sensitive context effects (see the contributions in Schwarz, Park, Knäuper, and Sudman, 1999). As a result, any observed age difference in self-reported attitudes and behaviors can reflect (a) a true difference, (b) a difference in the response process, or (c) an unknown mixture of both. In this paper I illustrate the problem by highlighting three age-sensitive context effects that are sufficiently pronounced to thwart meaningful cohort comparisons: question order effects, response order effects, and the effects of alternative responses.
Social psychologists and public opinion researchers have long been aware that attitude reports are highly context dependent (for reviews see Schuman and Presser, 1981; Sudman et al., 1996). When asked to report on their opinions, respondents can rarely retrieve a ready-for-use answer from memory. Instead, they need to form an answer on the spot, drawing on information that is accessible at that point in time. This gives rise to question order and response order effects, both of which are age sensitive.
Question Order Effects Decrease with Age
Question order effects emerge when preceding questions bring information to mind that respondents may otherwise not consider in answering a subsequent question (see Sudman et al., 1996, for a discussion of the underlying processes). To reduce these effects, researchers often introduce buffer items to separate related questions, thus attenuating the accessibility of previously used information. The same logic suggests that age-related declines in memory function attenuate question order effects because they render it less likely that previously used information is still accessible. The available data support this prediction (Knäuper, Schwarz, Park, and Fritsch, 2003, unpublished manuscript; Schwarz and Knäuper, 2000).
One of the most robust question order effects in the survey literature was observed by Schuman and Presser (1981), who asked respondents if a pregnant woman should be able to obtain a legal abortion “if she is married and does not want any more children” (Question A) or “if there is a strong chance of a serious defect in the baby” (Question B). Not surprisingly, more respondents support legal abortion in response to Question B than in response to Question A. More important, support for Question A decreases when Question B is presented first: Compared to the risk of a serious birth defect, merely “not wanting any more children” appears less legitimate, reducing support for legal abortion. Secondary analyses show that this question order effect results in a difference of 19.5 percentage points for younger respondents, but decreases with age and is no longer observed for respondents aged 65 and older (Figure 1). Thus researchers arrive at different conclusions about cohort differences depending on the order in which the questions are asked. When the “no more children” question (A) is asked first, support for abortion decreases with age, suggesting that older respondents hold more conservative attitudes toward abortion. Yet no age difference can be observed when the same question is preceded by the “birth defect” question (B).
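To make the arithmetic behind such comparisons concrete, the following Python sketch computes the question order effect (the drop in support for Question A when Question B is asked first) for two age groups, along with the apparent age gap under each question order. The class and function names are illustrative assumptions, and the counts are hypothetical values chosen only to mimic the qualitative pattern described above; they are not data from Schuman and Presser (1981).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Support for Question A in one age group under one question order."""
    supporters: int
    respondents: int

    @property
    def percent(self) -> float:
        return 100.0 * self.supporters / self.respondents


def order_effect(a_first: Cell, b_first: Cell) -> float:
    """Percentage-point drop in support for A when B is asked first."""
    return a_first.percent - b_first.percent


# HYPOTHETICAL counts for illustration only (not survey data).
younger = {"a_first": Cell(120, 200), "b_first": Cell(80, 200)}
older = {"a_first": Cell(82, 200), "b_first": Cell(80, 200)}

effect_younger = order_effect(younger["a_first"], younger["b_first"])
effect_older = order_effect(older["a_first"], older["b_first"])

# Apparent age gap (younger minus older) under each question order.
gap_a_first = younger["a_first"].percent - older["a_first"].percent
gap_b_first = younger["b_first"].percent - older["b_first"].percent

print(f"Order effect, younger: {effect_younger:.1f} points")
print(f"Order effect, older:   {effect_older:.1f} points")
print(f"Age gap when A is first: {gap_a_first:.1f} points")
print(f"Age gap when B is first: {gap_b_first:.1f} points")
```

With these made-up counts, the two age groups appear to differ markedly when Question A is asked first and not at all when Question B precedes it, which is the interpretive problem described in the text.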
Laboratory experiments (Knäuper et al., 2003) indicate that the attenuation of question order effects among older respondents is due to age-related declines in working memory capacity. Using Salthouse and Babcock’s (1991) reading span test, Knäuper and colleagues (2003) found that younger respondents (age 20 to 40 years), as well as older respondents (age 60 to 100 years) with good working memory capacity, showed the familiar question order effect, whereas older respondents with poor working memory capacity did not (Figure 2). A related experiment with different questions replicated this pattern.
However, the observed attenuation of question order effects may be limited to the relatively uninvolving questions typical for public opinion surveys. In contrast, older adults’ inhibition problems (Hasher and Zacks, 1988) may exaggerate question order effects when the question is of high personal relevance or emotionally involving. This possibility remains to be tested.
Response Order Effects Increase with Age
A second major source of context effects in attitude measurement is the order in which response alternatives are presented. Response order effects are most reliably obtained when a question presents several plausible response options (see Schwarz, Hippler, and Noelle-Neumann, 1992; Sudman et al., 1996, Chapter 6, for detailed discussions). To understand the underlying processes, suppose you were asked to provide a few reasons why “divorce should be easier to get.” You could easily do so, yet you could just as easily provide some reasons why “divorce should be more difficult to get.” When such alternatives are juxtaposed (as in “Should divorce be easier to get or more difficult to get?”), the outcome depends on which alternative is considered first. While the researcher hopes that respondents (a) hold the question and all response alternatives in mind, (b) evaluate the implications of each alternative, and (c) finally select the one that they find most agreeable, respondents may not do so. Instead, respondents who first think about “easier” may come up with a good supporting reason and may endorse this answer without engaging in the additional steps.
Importantly, the likelihood that respondents elaborate on a given alternative depends on the order and mode in which the response options are presented (Krosnick and Alwin, 1987). When presented in writing, respondents elaborate on the implications of the response options in the order presented. In this mode, an alternative that elicits supporting thoughts is more likely to be endorsed when presented early rather than late on the list, giving rise to primacy effects. In contrast, when the alternatives are read to respondents, their opportunity to think about the early ones is limited by the need to listen to the later ones. In this case, they are more likely to work backward, thinking first about the last alternative read to them. When this alternative elicits supporting thoughts, it is likely to be endorsed, giving rise to recency effects. Thus, a given alternative is more likely to be endorsed when presented early rather than late in a visual format (primacy effect), but when presented late rather than early in an auditory format (recency effect).
On theoretical grounds, we may expect that older respondents find it more difficult to keep several response alternatives in mind, elaborating on their respective implications to select the most appropriate answer. This should be particularly true in telephone interviews, where the alternatives are read to respondents without external memory support. The available data strongly support this prediction (see Knäuper, 1999, for a comprehensive review and meta-analysis). For example, Schuman and Presser (1981) asked respondents in a telephone interview, “Should divorce in this country be easier to obtain, more difficult to obtain, or stay as it is now?” Depending on conditions, the response alternative “more difficult” was read to respondents as the second or as the last alternative. Overall, respondents were somewhat more likely to select the response alternative “more difficult” when presented last, a recency effect. However, secondary analyses reported by Knäuper (1999) indicate a dramatic age difference: As shown in Figure 3, the size of the recency effect increased with respondents’ age, ranging from a nonsignificant 5 percentage points for age 54 and younger to 36.4 percentage points for age 70 and older. Thus different substantive conclusions will be drawn about the relationship of age and attitudes toward divorce, depending on the order in which the response alternatives are presented: Whereas attitudes toward divorce seem to become much more conservative with age under one order condition, no reliable age differences are obtained under the other order condition.
The available data suggest that age differences in response order effects are limited to the more taxing auditory format and are not observed when all response alternatives are presented in writing and remain visible. As an example, consider a study in which younger (age 20 to 40) and older (age 65+) adults were asked, “Which of the following four cities do you find most attractive?” (Knäuper et al., 2003). Washington, DC, was presented as either the first or last choice (Table 1).
When the response alternatives were read to respondents, younger as well as older adults were more likely to choose Washington, DC, when it was presented last rather than first. Replicating the pattern of the divorce findings, this recency effect was more pronounced for older than younger respondents, with differences of 24 versus 8 percentage points. When the response options were presented in writing, however, older as well as younger respondents were more likely to choose Washington, DC, when it was presented first rather than last. Importantly, this primacy effect under a visual mode was of comparable size for both age groups.
TABLE 1 Age, Mode, and Response Order Effects
In sum, the reviewed findings suggest that question order effects are likely to decrease with age, whereas response order effects are likely to increase with age. Both of these effects can be traced to age-related declines in cognitive resources. Hence, self-reports of attitudes are not only context dependent, but the size of the emerging context effects is itself age sensitive, rendering comparisons across age groups fraught with uncertainty. In addition, several studies indicate that older respondents are more likely to provide a “don’t know” or “no opinion” answer (for a review see Schwarz and Knäuper, 2000). The processes underlying the latter observation are still unknown.
BEHAVIORAL FREQUENCY REPORTS
Many questions about respondents’ behavior are frequency questions. Except for rare and very important behaviors, respondents are unlikely to have detailed representations of numerous individual instances of a behavior stored in memory. Rather, the details of various instances of closely related behaviors blend into one global representation (Linton, 1982; Neisser, 1986). Thus, many individual episodes become indistinguishable or irretrievable, due to interference from other similar instances (Baddeley and Hitch, 1977; Wagenaar, 1986), leading to knowledge-like representations that “lack specific time or location indicators” (Strube, 1987, p. 89). As a result, respondents usually have to resort to estimation strategies to arrive at a behavioral frequency report. See Bradburn, Rips, and Shevell (1987); Schwarz (1990); and Sudman et al. (1996) for extensive reviews, and the contributions in Schwarz and Sudman (1994) for research examples.
A particularly important estimation strategy is based on subjective theories of stability and change (see Ross, 1989, for a review). In answering retrospective questions, respondents often use their current behavior or opinion as a benchmark and invoke an implicit theory of self to assess whether their past behavior or opinion was similar to, or different from, their present behavior or opinion. Assuming, for example, that one’s political beliefs become more conservative over the life span, older adults may infer that they held more liberal political attitudes as teenagers than they do now (Markus, 1986). The resulting reports of previous opinions and behaviors are correct to the extent that the implicit theory is accurate. In many domains, however, individuals assume a rather high degree of stability, resulting in underestimates of the degree of change that has occurred over time (Collins, Graham, Hansen, and Johnson, 1985; Withey, 1954), while in other domains (Ross, 1989) respondents who believe in change will detect a change even though none has occurred.
At present, little is known about whether and how subjective theories of stability and change are themselves subject to change across the life span: Which aspects of the self do people perceive as stable versus variable? And at which life stage do they expect changes to set in? Moreover, it is likely that salient life events, like retirement or the loss of a spouse, will give rise to subjective theories of profound change. If so, retrospective reports pertaining to earlier time periods may exaggerate the extent to which life was different prior to the event, conflating real change with biased reconstructions. These issues provide a promising avenue for future research.
An alternative estimation strategy draws on the frequency scale presented by the researcher (for a review see Schwarz, 1999a). In a nutshell, respondents assume that the researcher constructs a meaningful scale. Hence, values in the middle range of the scale presumably reflect the “average” or “typical” behavior, while values at the extremes of the scale correspond to the extremes of the distribution. Based on this assumption, respondents use the frequency scale as a frame of reference in estimating the frequency of their own behaviors. This results in higher frequency reports along scales with high rather than low values. For example, 37.5 percent of a sample of German respondents reported watching TV for more than 2 1/2 hours a day when given a high frequency scale ranging from “up to 2 1/2 hours” to “more than 4 hours” a day. In contrast, only 16.2 percent reported doing so when given a low frequency scale ranging from “up to 1/2 an hour” to “more than 2 1/2 hours” (Schwarz, Hippler, Deutsch, and Strack, 1985).
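The size of this scale effect can be expressed as a simple difference in percentage points. The short Python sketch below does so for the two figures reported above; the function name is an illustrative assumption, and the computation is only the arithmetic implied by the text, not an analysis taken from Schwarz et al. (1985).

```python
def scale_effect(pct_high_scale: float, pct_low_scale: float) -> float:
    """Percentage-point gap between reports on the high and low frequency scales."""
    return round(pct_high_scale - pct_low_scale, 1)


# Percentages reporting more than 2 1/2 hours of daily TV, as given in the text.
tv_effect = scale_effect(pct_high_scale=37.5, pct_low_scale=16.2)
print(f"Scale effect: {tv_effect} percentage points")
```

Applying the same function separately to younger and older respondents would show whether the effect of the scale differs by age.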
Such scale-based estimation effects are more pronounced the more poorly the behavior is represented in memory (Menon, Raghubir, and Schwarz, 1995). This suggests that the impact of response alternatives may typically be more pronounced for older than for younger respondents. The available data support this prediction, with some important qualifications. As shown in Table 2, Knäuper, Schwarz, and Park (2004) observed that older respondents were more affected by the frequency range of the response scale when asked to report on the frequency of mundane events, such as buying a birthday present. On the other hand, older respondents were less affected than younger respondents when the question pertained to the frequency of physical symptoms, which older respondents are more likely to monitor (e.g., Borchelt, Gilberg, Horgas, and Geiselmann, 1999). In combination, these findings suggest that respondents of all ages draw on the response alternatives when they need to form an estimate of how often they do something. Yet the need to estimate depends on how much attention they pay to the respective behavior, and this attention is itself age dependent.
TABLE 2 Impact of Response Alternatives on Behavioral Reports as a Function of Content and Respondents’ Age
Note that the conclusions drawn from these reports about age-related differences in actual behavior depend on the scale format used. For example, age differences in red meat consumption (a health-relevant dietary behavior) or the purchase of birthday presents (an indicator of social integration) appear to be minor when a low frequency scale is used, but rather pronounced when a high frequency scale is used. To avoid systematic influences of response alternatives, and the age-related differences in their impact, it is advisable to ask frequency questions in open response formats that specify the relevant units of measurement. While the reports obtained under an open format are far from error free, they are at least not systematically biased by the instrument. See Schwarz (1999a) for a discussion.
As this review illustrates, minor changes in question wording, question format, and question order can profoundly influence the results obtained in sample surveys as well as in the psychological laboratory. Whereas researchers like to assume that respondents know what they believe and do, and can retrieve the proper information from memory, respondents usually have to compute the relevant answers on the spot, and this renders the answers highly context dependent. The underlying cognitive and communicative processes are systematic and increasingly well understood (for comprehensive reviews see Sudman et al., 1996; Tourangeau et al., 2000). Despite the general progress made, however, we know little about the impact of age-related changes in cognitive and communicative functioning on the response process; nor do we understand how age-related differences in the response process may be influenced by individuals’ educational attainment and related variables. The little we do know, however, highlights that the methodological challenge is a serious one: Age-related differences in cognitive resources, memory, text comprehension, speech processing, and communication can have a profound impact on the question-answering process, resulting in differential context effects for older and younger respondents. As a result, any observed age differences in reported attitudes and behavior may reflect (a) a true difference, (b) a difference in the response process, or (c) an unknown mixture of both. If we want to avoid the potential misinterpretation of age-sensitive methods effects as substantive findings, we need to understand how age-related changes in cognitive and communicative functioning interact with the features of our research instruments in shaping respondents’ reports. 
A systematic research agenda that addresses these issues is likely to advance our theoretical understanding of human cognition and communication across the life span and to improve the methodology of social research.
Baddeley, A.D., and Hitch, G.J. (1977). Recency reexamined. In S. Dornic (Ed.), Attention and performance (vol. 6, pp. 647-667). Mahwah, NJ: Lawrence Erlbaum.
Borchelt, M., Gilberg, R., Horgas, A.L., and Geiselmann, B. (1999). On the significance of morbidity and disability in old age. In P.B. Baltes and K.U. Mayer (Eds.), The Berlin aging study: Aging from 70 to 100 (pp. 403-429). New York: Cambridge University Press.
Bradburn, N.M., Rips, L.J., and Shevell, S.K. (1987). Answering autobiographical questions: The impact of memory and inference on surveys. Science, 236, 157-161.
Collins, L.M., Graham, J.W., Hansen, W.B., and Johnson, C.A. (1985). Agreement between retrospective accounts of substance use and earlier reported substance use. Applied Psychological Measurement, 9, 301-309.
Hasher, L., and Zacks, R.T. (1988). Working memory, comprehension, and aging: A review and a new view. In G.H. Bower (Ed.), The psychology of learning and motivation (vol. 22, pp. 193-225). San Diego: Academic Press.
Knäuper, B. (1999). The impact of age and education on response order effects in attitude measurement. Public Opinion Quarterly, 63, 347-370.
Knäuper, B., Schwarz, N., and Park, D.C. (2004). Frequency reports across age groups: Differential effects of frequency scales. Journal of Official Statistics, 20(1), 91-96.
Knäuper, B., Schwarz, N., Park, D., and Fritsch, A. (2003). The perils of interpreting cohort differences in attitude reports: Question order effects decrease with age. Unpublished manuscript. McGill University, Montreal, QC, Canada.
Krosnick, J.A., and Alwin, D.F. (1987). An evaluation of a cognitive theory of response order effects in survey measurement. Public Opinion Quarterly, 51, 201-219.
Linton, M. (1982). Transformations of memory in everyday life. In U. Neisser (Ed.), Memory observed: Remembering in natural contexts (pp. 77-91). San Francisco: Freeman.
Markus, G.B. (1986). Stability and change in political attitudes: Observed, recalled, and explained. Political Behavior, 8, 21-44.
Menon, G., Raghubir, P., and Schwarz, N. (1995). Behavioral frequency judgments: An accessibility-diagnosticity framework. Journal of Consumer Research, 22, 212-228.
Neisser, U. (1986). Nested structure in autobiographical memory. In D.C. Rubin (Ed.), Autobiographical memory (pp. 71-88). Cambridge, England: Cambridge University Press.
Ross, M. (1989). The relation of implicit theories to the construction of personal histories. Psychological Review, 96, 341-357.
Salthouse, T.A., and Babcock, R.L. (1991). Decomposing adult age differences in working memory. Developmental Psychology, 27, 763-776.
Schuman, H., and Presser, S. (1981). Questions and answers in attitude surveys. New York: Academic Press.
Schwarz, N. (1990). Assessing frequency reports of mundane behaviors: Contributions of cognitive psychology to questionnaire construction. In C. Hendrick and M.S. Clark (Eds.), Research methods in personality and social psychology: Review of personality and social psychology (vol. 11, pp. 98-119). Thousand Oaks, CA: Sage.
Schwarz, N. (1996). Cognition and communication: Judgmental biases, research methods and the logic of conversation. Mahwah, NJ: Lawrence Erlbaum.
Schwarz, N. (1999a). Frequency reports of physical symptoms and health behaviors: How the questionnaire determines the results. In D.C. Park, R.W. Morrell, and K. Shifren (Eds.), Processing medical information in aging patients: Cognitive and human factors perspectives (pp. 93-108). Mahwah, NJ: Lawrence Erlbaum.
Schwarz, N. (1999b). Self-reports: How the questions shape the answers. American Psychologist, 54, 93-105.
Schwarz, N., Hippler, H.J., Deutsch, B., and Strack, F. (1985). Response categories: Effects on behavioral reports and comparative judgments. Public Opinion Quarterly, 49, 388-395.
Schwarz, N., Hippler, H.J., and Noelle-Neumann, E. (1992). A cognitive model of response order effects in survey measurement. In N. Schwarz and S. Sudman (Eds.), Context effects in social and psychological research (pp. 187-201). New York: Springer Verlag.
Schwarz, N., and Knäuper, B. (2000). Cognition, aging, and self-reports. In D. Park and N. Schwarz (Eds.), Cognitive aging: A primer (pp. 233-252). Philadelphia: Psychology Press.
Schwarz, N., Park, D., Knäuper, B., and Sudman, S. (1999). Cognition, aging, and self-reports. Philadelphia: Psychology Press.
Schwarz, N., and Sudman, S. (1994). Autobiographical memory and the validity of retrospective reports. New York: Springer Verlag.
Sirken, M., Hermann, D., Schechter, S., Schwarz, N., Tanur, J., and Tourangeau, R. (1999). Cognition and survey research. New York: Wiley.
Strube, G. (1987). Answering survey questions: The role of memory. In H.J. Hippler, N. Schwarz, and S. Sudman (Eds.), Social information processing and survey methodology (pp. 86-101). New York: Springer Verlag.
Sudman, S., Bradburn, N., and Schwarz, N. (1996). Thinking about answers: The application of cognitive processes to survey methodology. San Francisco: Jossey-Bass.
Tourangeau, R., Rips, L.J., and Rasinski, K. (2000). The psychology of survey response. Cambridge, England: Cambridge University Press.
Wagenaar, W.A. (1986). My memory: A study of autobiographical memory over six years. Cognitive Psychology, 18, 225-252.
Withey, S.B. (1954). Reliability of recall of income. Public Opinion Quarterly, 18, 31-34.