Suggested Citation:"APPENDIX A BACKGROUND PAPERS." National Research Council. 1984. Cognitive Aspects of Survey Methodology: Building a Bridge Between Disciplines. Washington, DC: The National Academies Press. doi: 10.17226/930.


APPENDIX A BACKGROUND PAPERS

This appendix contains three papers prepared specifically for the CASM project. The first two--"Cognitive Science and Survey Methods" by Roger Tourangeau and "Potential Contributions of Cognitive Sciences to Survey Questionnaire Design" by Norman M. Bradburn and Catalina Danis--were sent to participants in advance of the St. Michaels seminar. The third paper, "Record Checks for Sample Surveys" by Kent Marquis, is based on a presentation by the author at St. Michaels. The first two papers focus on the basic questions addressed by the CASM project: What are the potential links between the cognitive sciences and survey research, and how can each discipline contribute to progress in the other? However, the authors examine the questions from different perspectives. Tourangeau identifies four kinds of cognitive operations performed by respondents in survey interviews: understanding questions, retrieving relevant information from memory, making judgments (if called for), and selecting responses. He reviews research in the cognitive sciences on each of the four topics and discusses the possible relevance of research findings to survey interviews. In discussing judgment and response, he gives special attention to the implications of findings from the cognitive sciences for attitude questions in surveys. Tourangeau also discusses the possible uses of surveys as a laboratory for cognitive research and explains how using surveys in this way might help to overcome some of the limitations of small-scale laboratory experiments. He describes one instance in which laboratory-based generalizations have been tested in surveys and briefly summarizes the results. Bradburn and Danis approach the same general subject from the point of view of the survey researcher. The authors first present a conceptual model of response effects, i.e., sources of variation in the quality of data obtained in surveys, and a general model for human information processing.
They then contrast the research methods used in the cognitive sciences and in survey research methodological studies. The main part of their paper reviews potential contributions of findings from cognitive research to four of the issues most frequently studied by survey researchers: question wording, response categories, contextual meaning, and response to survey items that require information on time or frequency of specific events or transactions. For each of these issues, some of the findings from the field of survey research are described, followed by a discussion of possible applications of pertinent theories and findings from cognitive research. In the concluding section, the authors caution that the application of findings from cognitive research to events outside of the laboratory is not direct and requires additional research.

The paper by Marquis addresses a topic that is of critical importance in attempting to use surveys as a vehicle for cognitive research. Laboratory experiments in the cognitive sciences are generally designed in a manner that permits direct, objective evaluation of the success of subjects in performing specified cognitive tasks. For example, subjects may be exposed to verbal material or other stimuli developed by the experimenter and then asked to recall, recognize, or make judgments about these known stimuli. In survey research, however, the stimuli (in survey research terms, the "truth") are not in general known to the survey researcher. This makes it difficult, in comparing alternative interview procedures, to determine which ones are most successful in minimizing response effects. Attempts to overcome this difficulty come under the general heading of validity checks, the subject of Marquis's paper. In validity checks, data are sought from external sources, such as official records, that are deemed to be less subject to response effects than are the survey results. These data, either in aggregate or individual form, are compared with data on the same subjects from surveys. Marquis's paper reviews alternative designs for validity checks and discusses their strengths and weaknesses.
He argues that certain designs lead to wrong conclusions about the direction and size of survey reporting bias and gives some examples from record checks associated with health surveys. He concludes that simple forgetting is not necessarily the dominant problem in health surveys.

COGNITIVE SCIENCES AND SURVEY METHODS

Roger Tourangeau

In the house of cognitive science there are many mansions. In one are the carefully controlled studies of the laboratory; in another, elaborate simulations of cognitive processes by computer scientists. Its subject matter is no less varied than its methods, including topics as diverse as the understanding of language and the inferring of causes, remembering and forgetting, perception, and judgment. It would take a kind of architect of ideas to characterize this rambling structure. In this review, I shall risk far more undercoverage than any survey researcher. I shall be purposive in my sampling, fitting the selection to my twin aims: my review will focus on the areas of cognitive science that have the most direct bearing on how surveys are conducted, and it will focus on one or two arbitrarily selected areas where the cognitive sciences could benefit most directly from the use of surveys. Much that is relevant will be overlooked. The territory is just too large and too varied for a single foray to include more than a few salient landmarks. I shall attempt to take two points of view in this survey. In attempting to bring the cognitive sciences to bear on survey methods, I present a cognitive analysis of the task of the respondent. In attempting to bring survey methods to bear on cognitive problems, I shall present the case for adding another mansion to the house of cognitive science. I examine two particular topics--forgetting and "optimism"--that could, I think, benefit from the use of survey samples. My hope is that the reader will fall prey to the amply documented tendency to overgeneralize from a few concrete cases.

The Respondent's Task

In the usual interview situation, the interviewer reads the respondent a question and a set of response options.
The respondent is supposed to attend to and understand the question, recall whatever facts are relevant, make a judgment if the question calls for one, and select a response. This list of cognitive operations is by no means exhaustive: for example, with open-ended questions the respondent must formulate an answer rather than selecting one from the pre-existing options. Sometimes a respondent will short-circuit the process, deciding to refuse to answer rather than to retrieve the relevant facts from memory. Nor is the canonical order--comprehension, retrieval, judgment, and response--invariably followed. The respondent may select the answer before the interviewer even finishes the question. Still, the respondent must usually go through each of these four steps, typically in the prescribed order. As should be obvious from this description of the respondent's task, there is considerable room for error. Respondents may misunderstand the question or the response categories; they may forget or misremember the crucial information; they may misjudge the information they do recall; and they may misreport their answer. Consider respondents who are asked whether they have seen a doctor in the past two weeks. First of all, they may misunderstand the reference period: some may think that the question covers the current calendar week plus the preceding one; others may interpret it to mean the period beginning two weeks ago to the day. Even if the respondent does understand which period is being referred to, he or she may find it difficult to determine the exact date bounding the reference period. A study by Elizabeth Loftus (Loftus and Marburger, 1983) indicates that some people may even misreport whether they have had a birthday "within the last six months." It seems likely that the source of their error is their inability to translate that phrase into a concrete date to which their birthday can be compared. Even when they understand the question, respondents may forget or misremember the relevant events. Two similar visits to the doctor can be remembered as a single event; Linton's (1982) massive study of her own memory for events in her daily life indicates that the inability to distinguish similar or repeated events is a major source of forgetting. The respondent who understands the question and who recalls the relevant events may still slip up at the judgment stage. A respondent who recalls a visit to the doctor but who can't quite date it has a judgment to make: Did the visit fall within the reference period or outside of it? Such judgments of recency are affected by a number of factors aside from the actual timing of the event. For example, we tend to judge emotionally salient events as more recent. Even when the respondent can correctly retrieve the relevant facts and correctly judge them, he or she may not report them accurately. We may omit some things to avoid embarrassing ourselves, invent others to avoid disappointing the interviewer. We also resolve ambiguities by recourse to what the situation seems to demand.
We ask ourselves whether a telephone call constitutes a consultation with a doctor, and our answer may depend on how we perceive the relative demands of completeness versus relevance. In the next four sections I examine each of these processes--comprehension, recall, judgment, and response--in more detail; I consider the main theoretical approaches used in studying them and attempt to draw out the implications of the theories for the practical problems of survey research.

Comprehension

Research on comprehension has, for reasons of methodological convenience, concentrated on the comprehension of written material. The same principles, however, are assumed to apply to spoken material as well; I refer to materials of both types simply as "text." There are two main approaches to the study of comprehension. One emphasizes the large cognitive structures the reader or listener brings to bear on the text; the other places somewhat more emphasis on the demands of the text itself. The two views are complementary rather than contradictory. I refer to the first as the top-down approach and to the second as the bottom-up approach. Although I try to sharpen the differences between them, the two approaches share several important notions--the profound impact of context, the use of prior general knowledge by the reader or listener, and the influence of inferential processes during comprehension. The difference in the approaches lies mainly in their views on the nature of the information we use in interpreting a text. The top-down approach emphasizes large pre-existing structures that can organize an entire text; the bottom-up approach emphasizes lower-level structures that can be used piecemeal.

Top-Down Processing

According to the top-down view, we understand a text by imposing a pre-existing organization on it. Until we impose such a structure, we have considerable difficulty in stitching together successive ideas into a coherent fabric (Bransford and Johnson, 1972):

    First you arrange items into groups. Of course one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to a lack of facilities that is the next step; otherwise you are pretty well set. It is important not to overdo things. That is, it is better to do too few things at once than too many. . . . After the procedure is completed, one arranges the materials into different groups again. They then can be put into their appropriate places.

Although each sentence in this passage presents no special difficulty, the passage as a whole is nearly impossible to interpret--until, that is, we learn that the passage is about doing the laundry. After that, the sentences fall neatly together into a tidy conceptual package. With most passages we are able to find some conceptual package early on in the text. (The Bransford and Johnson passage is deliberately constructed to prevent this from happening; it uses general terms--"procedure," "items," "different groups," etc.--where particular terms would have tipped us off.) Once the relevant structure is recognized, each succeeding idea finds its niche in the larger edifice.
Comprehension is seen by the top-down approach as a process of recognition: first we recognize the general pattern and then we recognize how each piece taken its foreordained place in the pattern. The patterns or structures are of two types. Some theorists (Mandler and Johnson, 1977; R''melhart, 1975) stress the importance of our knowledge of the general form of texts. We know, for example, that stories consist of settings, beginnings, middles, and endings. We know, further, how each constituent of a story breaks down into smaller constituents (e.g., a setting includes the time, place, and cast of characters) and we know how the constituents relate to each other ( e . g ., the beginning of the story triggers some goal in the protagonist which he or she then attempts to satisfy in the middle part of the story ~ . Our knowledge of the grammar and logic of stories allows us to fit the events of a story into a coherent structure. Other theorists (Abelson, 1 981; Bower et al ., 1 979 ; Schank and Abelson, 1977) emphasize more particular knowledge of the reader, knowledge about stereotypical situations. "Scripted is the label

usually applied to our knowledge about such everyday matters as doing the laundry, going to the doctor, eating at a restaurant. We comprehend a text by finding the pertinent script; then we match the ideas in the text with the prototypical events and roles of the script. The meaning of Bransford and Johnson's passage remains elusive precisely because it evokes no script.

Bottom-Up Processing

Not everything we read or hear falls neatly into some pre-existing mental cubbyhole (Thoreau, cited in Miller, 1979):

Near the end of March, 1845, I borrowed an axe and went down to the woods by Walden Pond, nearest to where I intended to build my house, and began to cut down some tall, arrowy white pines, still in their youth, for timber. . . . It was a pleasant hillside where I worked, covered with pine woods, through which I looked out on the pond. . . . The ice in the pond was not yet dissolved, though there were some open spaces, and it was all dark-colored and saturated with water.

Miller, taking the bottom-up approach, suggests we interpret this passage by constructing a "memory image" for it (Miller, 1979:204):

My memory image grew piecemeal in roughly the following way. First I read that the time was late March; I formed no image at this point, but filed it away for possible use later. . . . Next, I saw an indistinct Thoreau borrow an axe from an even less distinct somebody and walk, axe in hand, to some woods near a pond.

Most of us have never built a log cabin and have only the sketchiest notion of what it would entail. Despite the absence of anything so well-formed as a script, and despite the discursive, unstorylike nature of the text, we have little difficulty in following the passage. We form some sort of picture of what's going on (Miller's memory image) and the details of the passage fit into that picture.
All this is not to deny the importance of prior knowledge or of higher-level cognitive structures in the interpretation process: we may not know much about building log cabins, but we understand that Thoreau needs the axe for that purpose. Although Miller does not stress the point, we infer a structure of goals and subgoals: Thoreau borrows the axe to cut down the trees to build his house. The bottom-up approach (Kintsch and van Dijk, 1978, on microstructure; Miller, 1979; Miller and Johnson-Laird, 1976) emphasizes the range of the prior knowledge used in interpreting texts. In interpreting Thoreau, we draw upon our knowledge of early spring in New England, of pine trees and axes, of the conditions of 1845, of log cabins, of whatever, in fact, is needed for us to form a coherent mental image for the passage. It is mainly this data-driven character that distinguishes the bottom-up approach. The story grammar and script

theorists tend to focus on a relatively small number of prior structures; theorists discussing bottom-up processes, by contrast, tend to be quite catholic in their tastes, pointing out how the reader uses whatever background knowledge may be relevant to the text at hand.

The Importance of Prior Knowledge and Context

Both approaches to comprehension emphasize our use of prior knowledge--knowledge about the form of texts, knowledge about stereotypical situations, knowledge about concrete details--in interpretation. They also share the notion that context allows us to activate and use the relevant pieces from our vast fund of background information. Without context, we are unable to determine what information is relevant to the passage at hand. With changing contexts, we draw on radically different pieces of information in the comprehension process, leading to radically different readings. If Raskolnikov, rather than Thoreau, had borrowed the axe, we would draw rather different conclusions about his purpose. Prior knowledge is used to connect the ideas in a passage. In Thoreau's account, we saw how different actions are embedded as goals and subgoals. We fill in the linking connections and the omitted details. Miller's image of Thoreau's passage may be hazy in some respects, but it includes more detail than the passage strictly warrants--he infers a man, for example, and perhaps snow on the ground. His reading honors both the claims of the text and the claims of his own knowledge of the world.

Implications for Survey Methods

There are several main themes that both approaches share. We go far beyond the information given in interpreting a text; we fill in gaps and add details, making inferences that our background knowledge and the text at hand seem to call for. The inferences we make depend on the prior knowledge that is activated by the passage and its context; context guides the selection of the relevant prior knowledge.
The work on comprehension has focused on connected text rather than discrete items, on exposition rather than interrogation, on written rather than spoken prose. The relatively little work that has been done on answering questions has also tended to point up the importance of prior structures--such as scripts and story grammars--in guiding the processes by which we seek the information that the question requests (see Bower et al., 1979, and Mandler and Johnson, 1977). Despite the differences between an interview and a prose passage, there are a few generalizations that probably apply in both situations. First, the inference process can go awry--we may incorrectly infer what is only possible or at best probable. The problem is compounded because we do not sharply distinguish probable inferences (e.g., Thoreau's goal in borrowing an axe) from necessary ones (e.g., that going to the site required the narrator to move). Second, the context of a sentence (or question) will to a large extent determine the nature of the inferences

drawn, the scripts invoked, and the background knowledge brought to bear in interpreting it. In more practical terms, the literature on comprehension suggests that, other things being equal, related questions that are grouped may have the advantage over questions that are scattered throughout an instrument, because the grouped questions provide a helpful context for interpretation; longer questions may have the advantage over shorter ones, because longer questions create more context; explicit questions may have the advantage over questions that rely on tacit knowledge, because they leave less room for erroneous inferences; and questions tailored for different subgroups may have the advantage over more uniformly standardized questions, because different subgroups have different stores of background information that can lead to different interpretations. Each of these generalizations is an oversimplification; they are intended as guidelines rather than hard-and-fast rules. The advantages of long questions, for example, may be offset by the increase in syntactic complexity that usually accompanies increased length. Survey researchers are hardly unaware of many of these points. Bradburn (1982) describes several methodological studies on context and question order effects. One of the findings apparently illustrates how respondents may draw erroneous inferences about the meaning of a question. In some cases, the answer to a general question (such as "Taken all together, how would you say things are these days--would you say you are very happy, pretty happy, or not too happy?") may change, depending on whether it comes before or after specific questions (e.g., about one's marriage). Bradburn interprets this result to mean that when the general question comes last, people infer that it covers only what has been left out of the particular questions.
Their inference may stem from a general conversational rule that tells us to avoid unnecessary redundancy in our answers (Haviland and Clark, 1974). Other studies (Belson, 1968; Fee, 1979) show how differences in background can lead to differences in how a question is interpreted. Even terms in common use (such as "energy crisis") have a wide range of meanings; different kinds of people tend to favor different interpretations of them.

Oral Presentation

Cognitive scientists have concentrated their energies on the comprehension of written materials; it is natural to wonder how far we can apply their conclusions to spoken ones. Much of the work that has been done on the differences between reading and listening has come under the banner of research on attitude change, where the concern has been to compare the effectiveness of oral and written arguments. There is no clear winner. Studies suggest, for example, that oral presentation may be better for simple arguments but worse for complicated ones (Chaiken and Eagly, 1976). Other processes besides comprehension tend to be implicated in these studies--oral presentations may be more effective simply because they are more likely to command our attention, but it may be easier to understand written presentations if we bother to read them.

Mode of presentation may interact with other variables besides complexity. Oral presentation can produce gains in short-term memory performance, but the increases seem to be limited to the last few items in a series. Some phenomena are likelier to arise when material is presented orally--homonyms present little difficulty when read, and channel discrepancies between the verbal and nonverbal messages are unlikely to occur with material presented in writing. On the whole, however, there is little research on how the mode of presentation affects comprehension. There is a fair amount of evidence that memory for spoken language is generally poor (see, for example, Keenan et al., 1977), although not necessarily worse than memory for written language. (Keenan et al. also suggest that we may remember the irrelevancies, the asides, and the jokes better than the gist of the presentation.) Given the typical length of survey questions, forgetting the question is probably a less common source of respondent error than misunderstanding it.

Retrieval

Having interpreted--or misinterpreted--the question, the respondent now faces the problem of answering it. Almost all questions require us to rummage through our memories in search of an answer. The fallibility of memory is, of course, more widely appreciated than the fallibility of the comprehension process. This section takes up four issues: What do cognitive scientists say about the structure of memory and the processes used to search for information? What are the sources of forgetting? What can be done about them? When can memory be trusted? Memory is arguably academic psychology's oldest topic, systematic work having begun more than a century ago with Ebbinghaus. Like all topics, memory has had its ups and downs, but it has never fallen completely out of favor, not even during the behavioristic reign of terror that sent so many mentalistic topics into banishment.
It would be pointless to pretend that the vast literature on memory can be summarized in a few broad strokes. It cannot. The best I can hope to do is to highlight a few key ideas that relate most directly to the problems faced by survey researchers.

Episodic Versus Semantic Memory

Memory researchers distinguish between different stores for information, ranging from the very brief persistence of visual information in iconic storage to the more or less permanent retention of information in long-term memory. Generally, three stores are distinguished: sensory memory (which is thought to be very short lived); short-term memory (which corresponds, in some accounts, roughly to the active contents of consciousness); and long-term memory. Tulving (1972) has introduced a further distinction between semantic and episodic memory. Episodic memory contains our memories for experience, for events that are defined by a particular time and place; semantic memory contains more general

knowledge, such as our knowledge of the meanings of words, that is independent of the setting in which it was learned. Our memory for a particular text generally resides in episodic memory; the background knowledge used in interpreting the text generally resides in semantic memory. Tulving (1968; Tulving and Thomson, 1973) has argued that the context in which an event is experienced or a word is encountered determines how it is interpreted, stored, and retrieved from episodic memory. His encoding specificity principle states that sometimes we fail even to recognize a previously learned item because at recognition the item fails to reinstate the exact encoding it was given during learning. For example, when one puts eggs on a shopping list, he or she may fail to purchase the right item if candy Easter eggs were intended and hen's eggs were recalled. Although some psychologists (J. Anderson, 1976) question Tulving's distinction between semantic and episodic memory--arguing that both sorts of memory are retained in a single store and follow the same principles--few question the importance of context in encoding and recall.

Memory as an Associative Network

Our long-term memories are not long lists of all the facts we have learned and the events we have experienced. Memories are connected to each other, and the connections can be long and complicated or short and direct. Some theorists (J. Anderson, 1976; Collins and Quillian, 1972) view long-term memory as an associative network. They view individual ideas as nodes or points of intersection in the network; the ideas are connected by links, which correspond to the relations between ideas. The network analogy is useful for a number of reasons. First, it is compatible with most models of language comprehension; individual sentences or passages of text can be represented using the same network formalism. Second, it allows us to understand how items can be retrieved from memory or forgotten.
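The associative-network idea can be illustrated with a toy program: ideas as nodes, relations as links, and retrieval as a search outward from a cue node. This is only an illustrative sketch, not a model any of these theorists proposed; the node names and links below are invented for the Thoreau example.

```python
from collections import deque

# Toy associative network: each node (idea) maps to the ideas it links to.
# Node names and links are invented for illustration.
network = {
    "axe": ["chop", "Thoreau", "tool"],
    "Thoreau": ["Walden", "cabin"],
    "chop": ["tree"],
    "tree": ["pine"],
    "tool": [],
    "Walden": ["pond"],
    "cabin": ["tree"],
    "pond": [],
    "pine": [],
}

def retrieve(cue, target, max_depth=3):
    """Breadth-first search from a cue node; returns the chain of
    associations that leads to the target, or None if the target is
    effectively 'forgotten' (no short path from this cue)."""
    frontier = deque([[cue]])
    while frontier:
        path = frontier.popleft()
        if path[-1] == target:
            return path
        if len(path) <= max_depth:
            for neighbor in network.get(path[-1], []):
                frontier.append(path + [neighbor])
    return None

# A good cue leads to the item through a chain of associations;
# from a poorly connected cue, the same item is unreachable.
print(retrieve("axe", "pine"))   # a chain of associations exists
print(retrieve("pond", "axe"))   # None: isolated from this cue
```

The sketch captures the two claims in the text: items with few inbound paths are hard to retrieve, and a good cue is one that puts the search in the right part of the network.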
According to Anderson, for example, we remember by searching relevant portions of the memory network. We forget items that are relatively isolated--few paths in the network lead to the item sought--or whose connections to other items are poorly learned. This perspective implies that good cues for remembering something are those that lead us to search the right part of the network; in most cases, the best cue is the item itself. It comes as no surprise that recognition--recalling whether an item has been presented before--is almost always better than other forms of recall.

Retrieval as Reconstruction

We noted earlier that when we read a passage, we make inferences; we supply connections that are implicit in the text and we add plausible details. There is an abundance of evidence that we encode and store our interpretation of a text rather than the text itself; our memory for gist is far better than our verbatim memory (Bransford et al., 1972; Sachs,

1967). Not only do we lose verbatim details, we also seem to add (and recall) plausible details that fit our interpretation (Bartlett, 1932; Bower et al., 1979; Bransford et al., 1972; Cantor and Mischel, 1977; Mandler and Johnson, 1977). It is not always clear whether we make the inferences when we first encounter the material or when we remember it later. Some theorists argue that we may do much of the "filling in" at the time of recall. What we recall, of course, is likely to be incomplete or spotty; we may round out what we can remember with what we can infer. Once we draw an inference--during our initial encoding or later during an attempt at retrieval--it can be added to our representation of the event. Loftus and her colleagues have conducted several demonstrations on this point. In one (Loftus and Palmer, 1974, Experiment II), the subjects watched a film of a traffic accident. Afterwards, some of them were asked the question, "About how fast were the cars going when they smashed into each other?" The rest were asked a similar question with the word "hit" replacing "smashed into." The "smashed" subjects were more than twice as likely later on to report incorrectly that they had seen broken glass in the film. The subjects drew such inferences as "smashed into" seemed to warrant; later they could not disentangle what they had seen from what they had inferred.

Sources of Forgetting

As with research on comprehension, so with research on memory: it is more than a little hazardous to extrapolate from the sort of research conducted by cognitive psychologists to the practical concerns of the survey researcher. The vast bulk of the research on memory has concerned the memory over short periods of time (typically a few minutes) of arbitrary lists generally consisting of nonsense syllables.
Although my review has mainly covered research on more meaningful materials, such as text or filmed events, it is still a long way from laboratory studies to everyday memory. Fortunately, Linton's (1982) study of her own everyday memory suggests that the memory problems encountered in the research laboratory are similar to the ones encountered elsewhere. The research I have attempted to summarize pinpoints several different reasons for forgetting. One reason for forgetting is that we may not have transferred the relevant information into long-term memory. We do not attend to everything and, even when we do, we often attend in a rather mindless fashion. To find out whether one has already purchased the morning newspaper, it may be better to search one's briefcase than one's memory. Another reason that we forget things is that we simply cannot retrieve them. Linton (1982) refers to this type of forgetting as "simple failure to recall." Some representation of the event made it into long-term memory--in Linton's case, she could often recall such items for months or years--but gradually it becomes inaccessible, lost somewhere in the associative network. Linton describes another type of retrieval failure: "During the fourth and subsequent years [of the study] I began to encounter a few old items that simply did not 'make sense.'" These items--brief descriptions of events that she had written shortly after the events had taken place--no longer functioned as

interpretable cues for recall. The initial interpretation of the event ceased to have meaning outside of its original context. A number of researchers--beginning with Tulving and his encoding specificity principle--have argued that the value of a recall cue depends on its ability to reinstate the original interpretation (i.e., encoding) of an event. A cue that receives one interpretation in its original context and another in the context of recall will be a poor cue indeed--it leads us to search the wrong part of memory. The converse of this principle is that overlap between features of the original context and the context of recall will facilitate recall. In an intriguing review of the literature, Bower (1981) has shown that memory for lists of words, daily events, and childhood experiences is better when the mood during recall is consistent with the mood during the experience. Linton's study highlights another source of forgetting--over time, we may lose the ability to distinguish between similar events. It is easy enough to recall one's only trip to St. Michaels, far harder to recall one's trip to the office on March 14. The problem in recalling specific instances of repetitive events is that we forget the particulars and retain the pattern; we can reconstruct the essentials based on the pattern, but this strategy offers no basis for recovering the individuating details that would allow us to distinguish one trip to the office from another. In addition, details sometimes seem to wander from one instance of a "scripted" event to another (e.g., Bower et al., 1979, Experiment 3). Details may also be inferred, based on the pattern, even though they were absent from the particular instance (Bower et al., 1979, Experiment 4; cf.
Mandler and Johnson, 1977; and Cantor and Mischel, 1977). In summary, there are several processes of forgetting: (1) the information may never reach long-term memory; (2) it may be irretrievable, particularly when the context of recall differs sharply from the original context of the event; (3) it may be hard to distinguish from related information; (4) it may be tainted with intrusions and inferences made while or after the original information was learned.

Aids to Memory

It is no news to anyone that memory is fallible. Can anything be done to improve it? A fair amount of research has examined procedures to improve memory that are applied at the initial encoding stage. Although a number of tactics have proved to be effective (such as "deeper" initial processing, the use of mental imagery, and the application of mnemonic tricks), they are, from the point of view of the survey researcher, of little use--they require the respondent to do something special at the time of encoding and, of course, the researcher is usually not yet on the scene. In longitudinal studies and studies using diaries, of course, there are more opportunities to train respondents in the use of these mnemonic tactics. Still, tricks that can be applied at the retrieval stage, long after the relevant events have taken place, are of more general use to survey researchers. Not much is known about this kind of mnemonic trick. Erdelyi and Kleinbard (1976) suggest that when it comes to

retrieval, more is better: repeated attempts at recall yield more of the items sought. In addition, the more time we have to recall an item, the more likely we are to retrieve it. For this reason, longer questions, which may have an advantage at the comprehension stage, may also have the edge at the retrieval stage. (On the other hand, these advantages may be dissipated if greater length is accompanied by greater syntactic complexity.) Bower's (1981) work suggests that efforts to reinstate features of the original context will also yield fuller recall. Left to their own devices, people tend to favor "external" memory aids (such as calendars, diaries, and so on) over the mnemonic tactics usually studied by memory researchers (Harris, 1978). Survey researchers have not ignored the possibilities of using special diaries, logs, and so on for improving recall. Also, some surveys have incorporated existing records as a jog to respondents' memories; a respondent's checkbook may contain the most useful cues to help him or her recall a visit to a doctor. On the whole, people are reasonably aware of the shortcomings of their memories and of how to overcome them (Bell and Wellman, 1977). Forewarning them about what they will be asked to recall may be the easiest way to improve their memories.

What Can We Remember?

Not everything is remembered equally well or poorly; it is probably worthwhile to list some kinds of experiences that are likely to be especially well remembered. First of all, a number of researchers have argued that emotional events are unusually well remembered (see, for example, Brown and Kulik, 1977; Holmes, 1970; Sheingold and Tenney, 1982; Colegrove, 1898, is an interesting precursor to the Brown and Kulik work on "flashbulb" memories).
Linton (1982) suggests that emotional events are best remembered when they (1) were emotional at the time they occurred, (2) mark transitional points in one's life, and (3) remain relatively unique, to which Bower (1981) might add, (4) they retain their original emotional significance. Research on the reliability of eyewitness testimony, on the other hand, suggests that memory for emotional events may be especially prone to distortion. It is not clear how to reconcile these two lines of research. The key issue may be what happens between the event and the attempt to recall it. Emotional events do not pass unremarked; they are told and retold to friends and relatives, and embellishments that fill in or improve the story may be incorporated into the memory for the event, producing distortions during recall. Owing to the distortions, it may be as hard to remember the details as it is to forget the event. Other kinds of events have a clearer advantage in memory. Events that stand out against the background of a script or stereotype because they are anomalous in some way tend to be remembered (Bower et al., 1979, Experiment 7; Hastie and Kumar, 1979). In addition, the first few and last few events in a series will be better remembered than events in the middle; drawn-out events will be remembered better than brief ones. We also recall certain aspects of events better than others: essentials better than details (this may reflect the ease with which the essentials

can be reconstructed), actions and their outcomes better than their settings or motives, causally connected sequences better than merely temporal ones (Mandler and Johnson, 1977).

Judgment

Many types of questions require considerable cognitive work from the respondent after the relevant facts have been retrieved. The question may call for a judgment that requires several pieces of information to be combined; it may call for some inference. Hastie's recent review (Hastie, 1983) distinguishes several major approaches to the topic of social inference. After considering two of these approaches, I explore how different kinds of questions will lead to the use of different judgment strategies. I then examine in more detail the process by which attitude questions are answered.

Information Integration Theory

Although largely a one-man show, information integration theory (N. Anderson, 1974, 1981) has nonetheless had a profound impact in a number of areas within social psychology, ranging from impression formation to equity judgments. The essential idea is simple but powerful: when people must make judgments about something, they combine pieces of information using simple algebraic rules. For example, if we know that someone is a Republican banker, then we judge him to be likeable according to how likeable we find Republicans and how likeable we find bankers. Our judgment about Republican bankers is, according to Anderson, a kind of average of our separate feelings about the two categories. In rendering a judgment--such as a likeability rating--we evaluate the individual pieces of information we have and then we integrate them. The integration is likely to follow some simple rule such as multiplication, addition, or, in this case, averaging. The nuances of the theory involve how we evaluate the individual pieces of information, how we weight them, and when we use one integration rule instead of another.
Despite its apparent simplicity, the theory is remarkably adept at explaining a range of phenomena. Jones and Goethals (1972), for example, show that the same information can lead to different judgments depending on the order in which it is presented. (In line with nearly everyone's suspicions, first impressions do seem to carry inordinate weight.) N. Anderson (1974) has argued that this results from waning attention to later items and from the tendency to discount later information when it contradicts what is already given. In the language of integration theory, the weight an item receives depends upon its serial position. Anderson has applied the theory in too many domains to cover in any detail. The key points to bear in mind are that (1) people readily evaluate diverse pieces of information on a common scale, and (2) they seem to combine the information according to simple formulas.
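The weighted-averaging rule at the heart of the theory can be written as a one-line formula. The sketch below is a minimal illustration, not a calibrated model from Anderson's work; the scale values and weights (a 0-10 likeability scale) are invented for the Republican-banker example, and the serial-position effect is modeled simply by down-weighting the later item.

```python
def integrate(values_and_weights):
    """Weighted-averaging integration rule: the judgment is the
    weighted mean of the scale values of the individual pieces
    of information."""
    total_weight = sum(w for _, w in values_and_weights)
    return sum(v * w for v, w in values_and_weights) / total_weight

# Hypothetical scale values on a 0-10 likeability scale: (value, weight).
republican = (4.0, 1.0)
banker = (6.0, 1.0)

# Equal weights: the judgment is the simple average of the two values.
print(integrate([republican, banker]))

# A primacy effect can be modeled by giving the first item more weight;
# the judgment shifts toward the first impression.
print(integrate([(4.0, 2.0), (6.0, 1.0)]))
```

With equal weights the result is 5.0, the simple average; doubling the weight of the first item pulls the judgment toward it, which is the integration-theory reading of the primacy effect described above.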

Judgmental Heuristics

Anderson's work points up the precision with which we can render judgments. The main alternative to information integration theory, the judgmental heuristics approach, presents the contrasting view that we apply loose rules of thumb in rendering many judgments; the rules of thumb ("heuristics") often lead us far astray. Kahneman and Tversky (1971, 1973; Tversky and Kahneman, 1973) have identified three such heuristics--availability, representativeness, and anchoring and adjustment--and shown how they can lead to systematic errors in judgments of frequency and likelihood. We use the availability heuristic when we judge the frequency of a class of events based on the number of examples we can generate or the ease with which we generate them. For example, people incorrectly judge that words beginning with the letter R are more frequent than words whose third letter is R--presumably because they can more easily call to mind words that begin with R. Similarly, people may overestimate the likelihood of memorable but rare events because examples are called readily to mind. Representativeness refers to the tendency for judgments regarding category membership to be rendered solely on the basis of resemblance to some prototype or underlying process. For example, people regard boy-girl-boy-girl-girl-boy as a much likelier sequence of births than three boys followed by three girls--the first sequence seems more representative of the underlying random process. Or they will judge a person described as quiet as likelier to be a librarian than a salesman--even when they are told there are far more salesmen than there are librarians. The representativeness heuristic leads to errors because it fails to incorporate the statistical considerations--such as base rate information--that are also relevant to the judgment. We show little hesitancy about generalizing from a few cases if the cases seem representative of the underlying process or category.
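The base-rate error in the librarian example can be made concrete with Bayes' rule. The numbers below are invented for illustration (a population with fifty salesmen per librarian, and a guess at how often each is described as "quiet"); the point is only that base rates can swamp resemblance.

```python
def posterior(prior_a, prior_b, like_a, like_b):
    """P(A | evidence) by Bayes' rule for two competing hypotheses:
    prior probabilities of A and B, and the likelihood of the
    evidence under each."""
    return (prior_a * like_a) / (prior_a * like_a + prior_b * like_b)

# Invented numbers: suppose salesmen outnumber librarians 50 to 1,
# and "quiet" describes 80% of librarians but only 20% of salesmen.
p_librarian, p_salesman = 1 / 51, 50 / 51
p_quiet_given_librarian, p_quiet_given_salesman = 0.8, 0.2

p = posterior(p_librarian, p_salesman,
              p_quiet_given_librarian, p_quiet_given_salesman)
print(f"P(librarian | quiet) = {p:.3f}")
# Despite the resemblance, a quiet person from this population is far
# likelier to be a salesman; judging by representativeness alone
# amounts to dropping the priors from the formula.
```

Under these assumptions the posterior probability of "librarian" is only about 0.07, even though "quiet" fits the librarian stereotype four times better: exactly the base-rate information the representativeness heuristic ignores.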
Small, biased samples present few problems for nonstatistical inference. A third judgmental heuristic is anchoring and adjustment. The anchor refers to some starting point or preliminary estimate; this initial judgment is then adjusted to produce the final verdict. Problems arise when the adjustment is insufficient or the anchor misleading. For example, when asked to calculate 10 x 9 x 8 x . . . x 2 x 1 in their heads, people appear to multiply the first few numbers and, using this as their anchor, adjust the result upward. They come to a consistently higher conclusion than people asked to multiply 1 x 2 x 3 x . . . x 9 x 10, who presumably begin their adjustment from a lower anchor point. Both groups tend to underestimate the actual answer, suggesting that the adjustment is insufficient in either case. A more dramatic example of the inappropriate use of anchoring and adjustment is provided by Ross, Lepper, and Hubbard (1975). Subjects in the study were misled about their performance in distinguishing authentic suicide notes from fake ones. Some subjects were told they had performed much above (or much below) average; later on, the experimenter admitted that the evaluation had been a complete hoax. Despite the discrediting of the evidence on which their opinions were based, "success" subjects continued to

believe they were better than average at the task and "failure" subjects continued to believe they were worse than average. The subjects apparently underadjusted their opinions in the light of the discrediting of the feedback.

Types of Questions

In the dozen years since Kahneman and Tversky burst upon the psychological scene, the number of heuristics and biases that we have been shown to exhibit has grown at something like an exponential rate. Studies indicate that we persevere in the face of discrediting evidence (Ross, Lepper, and Hubbard, 1975); ignore base-rate information (Kahneman and Tversky, 1971; Nisbett and Ross, 1980, Chapter 7); generalize from small or biased samples (Tversky and Kahneman, 1971); make extreme predictions based on invalid indicators (Kahneman and Tversky, 1973); perceive illusory correlations that sustain our prejudices (Chapman and Chapman, 1969; Hamilton and Gifford, 1976); search for evidence in ways that can only confirm our views (Snyder and Swann, 1978); believe in our control over chance events (Langer, 1975); attribute causal significance to whatever happens to be salient during an event (Taylor and Fiske, 1975; McArthur and Post, 1977); ascribe all sorts of personal characteristics to people who are clearly doing only what the situation requires (cf. Ross, 1977, on the "fundamental attribution error"); "recognize" new pieces of information that confirm our stereotypes (Cantor and Mischel, 1977)--and, in spite of all these failings, we remain confident that our judgments are sound (Einhorn and Hogarth, 1978). The list of our shortcomings could easily be extended; it can be as long as the reviewer is persistent. Still, it is hard to believe that our judgment is as bad as all this might suggest; we must occasionally get things right.
Something of a backlash has already set in (see, for example, Loftus and Beach, 1982); in their comprehensive review, Nisbett and Ross (1980) take pains to argue that, despite the thousand natural errors that the mind is heir to, we often do make valid judgments and sound decisions.

Implications for Survey Research

It would be nice for everyone if we could reduce this long list of biases to one or two fundamental errors from which the rest could be derived, but no one has thus far succeeded in imposing order on this error-filled crew. Failing that, it would be useful if we could identify the particular judgment tasks that typically elicit particular strategies with their characteristic shortcomings, but no one has succeeded even in that endeavor. Nisbett and Ross (1980) do manage to relate different errors to different steps in the judgment process (e.g., gathering evidence, summarizing it, etc.), but particular judgment tasks differ greatly in which steps they require. Sykes (1982) distinguishes four types of questions commonly used in surveys: those that request factual or behavioral information; those

that assess the respondent's awareness or knowledge of a subject; those that elicit attitudes or opinions; and those that call for a reason or explanation. Questions about what we have done and what we know are probably the least susceptible to judgmental error--although judgments of the frequency or recency of a behavior are known to be distorted in predictable ways. (Most of the Kahneman and Tversky work bears on the issue of frequency judgments.) Questions about the causes or reasons for behavior, on the other hand, require considerable judgment on the part of the respondent, and the judgment process seems to be very flawed (Ross, 1977); what is more, people often seem to use the same process whether they are explaining their own behavior or someone else's (Bem, 1967, 1972; Nisbett and Wilson, 1977)--many of our explanations appear to be based on commonsensical notions of why people behave the way they do rather than on access to some fund of private knowledge. Nisbett and Wilson (1977) argue that if we do know ourselves better than others know us, it is largely in the historical sense that we remember how we have behaved in the past.

Attitude Questions

Attitude questions are an interesting case because very little is known about how we answer them. The process is likely to be quite complex, involving both the use of judgmental heuristics and the application of integration rules.
Attitudes have been shown to be sensitive to a range of influences: beliefs (Ajzen and Fishbein, 1972; Rosenberg, 1956); values (Rokeach, 1971); norms (Ajzen and Fishbein, 1972); feelings (Abelson, Kinder, Peters, and Fiske, 1982); other attitudes (Abelson et al., 1968); behaviors (Freedman, 1965); and arguments. The exact nature of the relationship between attitudes, on the one hand, and beliefs, values, norms, emotions, and behaviors, on the other, is often hard to disentangle; behaviors, for example, are clearly affected by attitudes, but the reverse is also true, and both attitudes and behaviors are affected by still other things, such as norms. It is tempting to conclude that we must answer attitude questions by retrieving the relevant beliefs, values, behaviors, and so on from memory and then making a judgment based on whatever is brought to mind. Unfortunately, we know that many of the things that affect attitudes--such as persuasive arguments--are quickly forgotten, though their impact lingers on (Hass and Linder, 1972). (A more familiar version of the phenomenon is knowing that we have heard convincing evidence on some issue without being able to recall what the evidence is.) If we do not retrieve evidence in answering attitude questions, then what do we retrieve? There are probably different answers to this question depending on the nature of the attitudes we hold. Let us first distinguish well-formed attitudes from "nonattitudes" (Converse, 1963). There are many topics on which we have no particular opinion--we may have some beliefs but they will be weakly held and poorly supported; we may have some feelings but they will be inconsistent and changeable. With nonattitudes, we are likely to retrieve the one or two salient pieces of information that we have--perhaps a cliche we have heard about

the topic or some example of our behavior. Having sampled our memories in this haphazard way, we scale what we have retrieved; we ask ourselves "what are the evaluative implications of this cliche, that behavior?" We then combine the evaluations assigned to each piece of information--perhaps we average them or perhaps we use whatever comes to mind first as an anchor and then adjust our judgment as we recall and evaluate additional bits of information. With nonattitudes, then, answering an attitude question has three stages: we sample our views, we scale them, and then we combine them. Since the views we hold are so poorly interconnected, what we retrieve on one occasion may bear little resemblance to what we retrieve on the next. We must expect nonattitudes to be quite unreliable.

There are also topics about which we have passionate feelings or deeply held beliefs. It is probably useful to distinguish several categories of such well-formed attitudes: (1) attitudes that consist of well-developed stereotypes; (2) attitudes that relate to deeply felt, symbolic concerns; (3) attitudes that relate to immediate, practical issues. Research on stereotypes is hardly new but it has taken on new impetus recently because of the interest in the internal structure of categories (Rosch and Lloyd, 1978; see also the work on schemata by Markus, 1977; Rumelhart and Ortony, 1977). According to the schema theorists (Rumelhart and Ortony, 1977; Ortony, 1979), our conception of a category may consist of a kind of frame (the schema) with a series of "slots" for the details that distinguish one category member from another. The slots may contain "default" values, reflecting our sense of what the prototypical category member is like. With socially significant categories, one of the slots may well contain our evaluation of the instance or a default evaluation that reflects our judgment of the category as a whole.
Thus, it is plausible to suppose that when we ask a Marxist how well he or she likes bankers, the question activates a dog-eared stereotype or schema that includes an evaluative slot with the information that bankers are capitalists of the worst kind. Similarly, we may store evaluative reactions alongside less emotionally charged information in our memory representation for individual category members such as particular political figures (Abelson et al., 1982). Schemata do not come into play only with familiar social groups. Attitudes about abstract political issues often take on a schematic cast. For example, research on attitudes towards school busing (Kinder and Sears, 1981) suggests that attitudes on this emotional question are less a function of practical considerations (such as whether one's own children are likely to be bused) than of symbolic ones. People's attitudes on the busing question are most easily predicted from their attitudes about apparently unrelated issues, such as "welfare." It is as if people saw these issues as manifestations of some larger pattern; in the case of busing, opponents seem to see it as an example of a general schema in which liberal "reforms" pose a threat to traditional American values. In the case of symbolic attitudes, then, what gets retrieved in answering an attitude question is a kind of schema that includes deeply felt emotions.

With some issues, practical considerations do seem to have the upper hand. Surely some of the people who favored the recent tax cuts favored them because they had a good deal to gain from them. People who hold such practical attitudes are probably more open to attitude change than people holding symbolic attitudes or attitudes based on stereotypes. This suggests that questions regarding practical attitudes may be answered through a process quite different from the one used when stereotypes or symbolic issues are involved. Instead of retrieving some schema with its ready-made evaluation, we may retrieve the relevant evidence (beliefs one holds, arguments one has heard), evaluate it, and combine it to render our judgment. This process is quite similar to the one proposed for nonattitudes, only the memories activated are far more detailed. These four types of attitudes are meant as pure cases. Many attitudes doubtless include elements of both symbolic significance and practical calculation; one man's cliched nonattitude is another man's stereotype. Still, it is worth noting that with some attitude questions we must compute an evaluative judgment; with others, we simply retrieve one. The form of the attitude question may also influence whether we employ the one process or the other. Likert-type items, which require us to rate how much we agree or disagree with some attitude statement, may encourage the retrieval and integration of more detailed information from memory than items calling for a simple evaluation. The best way to assess attitudes may depend therefore on the type of attitude being assessed.

Response

The respondent's task is not quite finished. Having rendered the judgment that the question demands, he or she must now select (or formulate) a response. This section deals exclusively with the selection of a response from a pre-established set of response categories.
Open-ended questions require respondents to formulate their answers, a process too complicated to cover here. This section deals with two major issues--how respondents select their answers and when they misreport them.

Response Selection

To select a response is to make a choice. Psychologists have studied a wide variety of decision rules to cover a range of choice situations. One of the most intensively studied rules--the Luce choice rule (Luce, 1959)--underscores the fact that making choices is a chancy business; confronted with an identical set of alternatives, we do not always select the same one. According to the Luce choice rule, this may occur not because we evaluate the options any differently but because the decision process is inherently probabilistic. The Luce choice rule says that we assign a value to each option; the probability of selecting a particular option is the proportion its value represents relative to the total value

assigned to all of the options. Although the Luce choice rule has been successfully applied in a variety of settings (e.g., it has been proposed as the method by which we select the answers to analogy problems, Rumelhart and Abrahamson, 1973), it is not an intuitively appealing model for response selection in surveys. A more intuitive procedure has been suggested by Tversky (1972), who argues that we make choices by eliminating the options that lack some desirable feature or aspect. Like the Luce choice rule, Tversky's model incorporates a chance element: our final decision depends on which aspects we select as a basis for narrowing down the options, and this selection process is partly a matter of chance. Some decision models have been proposed specifically for memory tasks such as deciding whether one recognizes an item. These models assume that whether one accepts an item as "old" depends not only on its level of activation (its feeling of familiarity) but also on one's criterion for accepting an item as old. The criterion will be influenced by the relative probabilities and costs of the two types of possible errors (failing to recognize an old item, falsely recognizing a new one). (See Bower et al., 1979, for another recognition decision model based on the level of activation.) The Luce rule and Tversky's elimination-by-aspects model are useful in pointing up the chance component in the choice process. The models of choice applied to memory are useful in underscoring another point--the role of nonfactual considerations in response selection. Respondents with relatively low criteria may report events they recall in only the vaguest of terms because they feel that it is better to give some response, however inaccurate, than none. Similarly, respondents may report an attitude simply to avoid the minimal embarrassment of admitting that they do not have one--it is probably not difficult to produce an attitude judgment on demand.
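The Luce choice rule as described above (a value assigned to each option, choice probability proportional to value) can be sketched in a few lines. This is a minimal illustration; the response options and values are invented for the example.

```python
import random

def luce_probabilities(values):
    """Luce (1959): p(option) = value(option) / total value of all options."""
    total = sum(values.values())
    return {option: v / total for option, v in values.items()}

def luce_choose(values, rng=random):
    """Sample a single option according to its Luce choice probability."""
    options = list(values)
    weights = [values[o] for o in options]
    return rng.choices(options, weights=weights, k=1)[0]

# Invented values: even with identical alternatives and stable values,
# repeated questioning yields different responses by chance alone.
response_values = {"agree": 3.0, "undecided": 2.0, "disagree": 1.0}
probs = luce_probabilities(response_values)  # agree: 3/6, undecided: 2/6, disagree: 1/6
```

The chance element is built in: two administrations of the same question to the same respondent can produce different answers without any change in the underlying values.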
Responses to even the most factual of items are influenced by considerations besides the facts. Many models of choice assume that the respondent has some ideal in mind and then selects the option that is the closest approximation to it. Attitude scaling techniques--such as Thurstone scaling and Coombs's unfolding technique--presuppose the existence of a unitary, underlying attitude dimension on which the respondent's ideal point and the various attitude items can be positioned. These models alert us to the problem of "nonscale" respondents, for whom the dimension is not meaningful, and of extreme ones, who lie outside the range of the options provided. In many situations, we are willing to settle for something that is considerably less than ideal; often we demand no more than what will suffice. The widespread use of "satisficing" (Simon and Stedry, 1969) rules creates some headaches for the survey researcher. Respondents, like the rest of us, probably do not always wait patiently until all the options have been laid out; they may leap at the first option that seems satisfactory, ignoring the rest. A number of techniques can be used to reduce this tendency--placing normatively less desirable options first in a series, shifting the order around to force the respondent to attend to all the options, announcing the options ahead of time.
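The satisficing pattern just described, leaping at the first acceptable option rather than weighing them all, can be contrasted with full maximizing in a small sketch. The utilities and threshold here are invented for illustration, not taken from the survey literature.

```python
def maximize(options, utility):
    """Examine every option and return the best one."""
    return max(options, key=utility)

def satisfice(options, utility, threshold):
    """Return the first option whose utility clears the threshold,
    ignoring everything after it (a simple satisficing rule)."""
    for option in options:
        if utility(option) >= threshold:
            return option
    return options[-1]  # nothing sufficed; fall back to the last option

# Response options in the order read aloud, with invented utilities.
options = ["excellent", "good", "fair", "poor"]
utility = {"excellent": 0.6, "good": 0.9, "fair": 0.4, "poor": 0.1}.get

# A satisficer who hears "excellent" first and finds it acceptable
# never waits to hear "good", even though "good" fits better.
first_acceptable = satisfice(options, utility, threshold=0.5)  # "excellent"
best_overall = maximize(options, utility)                      # "good"

# Shifting the order changes the satisficer's answer but not the maximizer's,
# which is why rotating response options can reduce the bias.
reordered = ["poor", "fair", "good", "excellent"]
reordered_choice = satisfice(reordered, utility, threshold=0.5)  # "good"
```

Note how the reordering technique mentioned in the text falls out of the sketch: the maximizer is order-invariant, while the satisficer's answer depends entirely on which acceptable option comes first.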

Reporting Biases

Although there is a fair amount of evidence that we describe ourselves in more glowing terms than others do, it is not clear why this is so. We may believe our favorable self-descriptions, or we may merely hope that others believe them. The process may be conscious and deliberate or unconscious and unintended. Besides these "self-serving" biases (Nisbett and Ross, 1980), which are as well known to survey researchers as to cognitive scientists, there are probably a number of subtler reporting biases. For example, people tend to exhibit consistency among their attitudes (Abelson et al., 1968) and between their attitudes and behaviors (e.g., Salancik and Conway, 1975); pressure toward consistency seems to be induced merely by asking questions (McGuire, 1960). While consistency pressures can result in long-lasting attitude change (Freedman, 1965), they can also produce short-lived and, from the point of view of the survey researcher, artifactual changes. We strive to present ourselves in a favorable light, and we strive to be consistent. We are prey to other sources of distortion in our responses: we select socially desirable responses (Crowne and Marlowe, 1964), we fulfill the investigator's expectations (Rosenthal and Jacobson, 1966), and we meet the perceived demands of the situation (Orne, 1962). These sources of response error are no news to survey researchers, who probably know more about reducing their effects than cognitive scientists. I shall not dwell on them here.

Survey Research as Cognitive Laboratory

All of the foregoing has been an attempt to show that the cognitive sciences have something to offer survey methodology. They can point out sources of potential error and methods for reducing them. This section takes the opposite tack and examines what survey methods have to offer cognitive scientists.
It is perhaps somewhat surprising that survey research is not already among the mansions in the house of cognitive science. Certainly, the problems that cognitive psychologists study can often be studied as readily in the field with survey samples as in the laboratory with college students. Sometimes there is a remarkable convergence, as when Ross et al. (1978) and Fields and Schuman (1976) nearly simultaneously discovered the same phenomenon--our tendency to see our own opinion as relatively common--using laboratory and survey methods. Investigators are rarely trained in both sets of methods, and they tend to frame their questions in ways that lend themselves to investigation by the methods that they know best. Researchers are no less susceptible to the unsystematic sampling of alternatives than anyone else, so it is really no surprise that they should tailor their research to the familiar methods that spring to mind. There are costs to letting the tail wag the dog in this way. In the case of cognitive psychology, critics such as Neisser (1982) have charged that the research tradition consists primarily of studies involving impractical problems performed in unnatural settings by

unrepresentative samples. In reviewing the literature on comprehension, I noticed that the research centers almost exclusively on connected expository text, which makes it difficult to draw conclusions about discrete spoken questions. The same sort of observation can be made about research on memory; the bulk of work concerns memory for arbitrary lists, often composed of nonsense syllables. Neisser (1976, 1978) argues that much of this work lacks "ecological validity," since the tasks and settings are far removed from anything that exists outside of the laboratory. Of course, in some sense, the whole point of laboratory research is to create artificial situations where the effects of variables can be isolated from each other. It is nonetheless hardly illegitimate to wonder whether laboratory tasks really capture the essence of the real-world problems they are designed to model. The most convincing answer to the critics is to demonstrate that the principles discovered in the laboratory do in fact apply in other settings and with other populations. Some problems do not easily lend themselves to laboratory methods: it is difficult to study disaster, passionate love, or schizophrenia by attempting to simulate them in the controlled setting of the laboratory. The best that the laboratory researcher can do is to create some weak or "acute" version of the phenomenon of interest--and to hope that it remains the same phenomenon. The researcher's conflicting aims are to reproduce the phenomenon closely enough to be convincing but not so closely as to be unethical. Large survey samples can give access to phenomena and populations that are outside the range ordinarily available to the cognitive psychologist. Two phenomena--forgetting and optimism--illustrate the advantages of adding a little more variety to the techniques of cognitive science.
Forgetting

The issue with forgetting is how far the principles that apply to forgetting in the laboratory generalize to other settings and other types of material. Loftus (1982) lists a number of laboratory-based generalizations that have been tested in survey settings: (1) we forget more as time passes, with a higher rate of forgetting shortly after we have learned the material; (2) our memories get worse as we grow older; (3) we are more likely to forget items in the middle of a sequence than ones at the beginning or the end; (4) we are less likely to forget something the longer we are exposed to it; and (5) we are more likely to recognize something than to recall it unaided. Loftus examines each of these generalizations in the light of results from surveys, with mixed results. The classical negatively accelerated forgetting curve, for example, does not always seem to hold; instead, people sometimes forget at a constant rate (see also Linton, 1982; Sheinwold and Tenney, 1982). On the other hand, the relationship between exposure time and recall appears to be robust. It is unclear why personal experiences seem to follow a different forgetting curve from other memories; it raises the intriguing possibility that different types of material may be forgotten through different processes.
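The contrast between the classical negatively accelerated curve and constant-rate forgetting can be sketched with two toy retention functions. Exponential decay is used here only as a common stand-in for the classical curve, and the decay parameters are arbitrary; neither function is taken from the studies cited above.

```python
import math

def negatively_accelerated(t, decay=0.5):
    """Classical laboratory curve: steep loss shortly after learning,
    flattening out as time passes."""
    return math.exp(-decay * t)

def constant_rate(t, rate=0.08):
    """Linear forgetting, as sometimes observed for personal
    experiences in survey data."""
    return max(0.0, 1.0 - rate * t)

# Compare the loss over the first time interval with the loss over an
# equally long interval later on.
early_loss = negatively_accelerated(0) - negatively_accelerated(1)
late_loss = negatively_accelerated(4) - negatively_accelerated(5)
# For the classical curve, early_loss exceeds late_loss, while the
# linear curve loses the same amount in every interval.
```

The practical point for survey recall questions is that the two curves make different predictions about how much extra error a longer reference period buys: much less than proportional under the classical curve, roughly proportional under constant-rate forgetting.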

It can be difficult to measure forgetting in a survey setting, but it is hardly impossible. Sometimes records are available as a means of checking the respondents' recollections. Other times, it is only possible to assess overall levels of forgetting by comparing two groups of respondents given similar recall tasks; the group that reports more has presumably remembered more.

Optimism

With memory research, the main problem is that the range of the stimuli is restricted. With research in other areas, the problem is that the range of the subjects is restricted. Laboratory-based research indicates, for example, that people hold a number of optimistic beliefs: the desirability of an event is seen as being related to its probability; bad events are seen as likelier to happen to others than to ourselves (Weinstein, 1978). Investigators have suggested a number of plausible explanations for this optimistic tendency, ranging from the idea that hope, even false hope, helps to ward off passivity to the notion that happy outcomes are better attended to and consequently are more likely to be recalled. Since most of the research on optimism has been conducted with college students, the question naturally arises whether these optimistic beliefs are not, in some sense, realistic--after all, college students in the United States have a fairly cushy existence, sheltered from much of life's storm and stress. Aren't they right to believe that their futures are rosy? A related hypothesis is that optimistic beliefs are the product of the optimistic ideologies that prevail in American society. The Christian belief in personal salvation, the liberal belief in technological progress, the conservative belief in individual efficacy--these form our ideological heritage. It is difficult to get a heterogeneous enough sample of respondents to test these ideas unless one has access to a large survey sample. Are optimistic beliefs the residue of generally positive life experiences?
We need to talk to the poor, the sick, the lonely to find out whether they, too, believe in their relative invulnerability to life's unpleasantness. Are individual optimistic beliefs just an outcropping of an underlying cultural optimism? We need to talk to new arrivals, to members of cultural outgroups, to the unassimilated to test this hypothesis. And the non-optimists will be of interest for either account. Do the depressed, the anxious, the paranoid show an unpleasant history that justifies their pessimism about the future? Were they somehow immune to the ambient ideological optimism? It is impossible to explore these questions satisfactorily without the sheer variety of people that only a large survey sample can offer.

References

Abelson, R. 1981 Psychological status of the script concept. American Psychologist 36:715-729.

Abelson, R., Aronson, E., McGuire, W., Newcomb, T., Rosenberg, M., and Tannenbaum, P. 1968 Theories of Cognitive Consistency: A Sourcebook. Chicago: Rand McNally.

Abelson, R., Kinder, D., Peters, M., and Fiske, S. 1982 Affective and semantic components in political person perception. Journal of Personality and Social Psychology 42:619-630.

Ajzen, I., and Fishbein, M. 1972 Attitudes and normative beliefs as factors influencing behavioral intentions. Journal of Personality and Social Psychology 21:1-9.

Anderson, J. 1976 Language, Memory, and Thought. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Anderson, N. 1974 Cognitive algebra. In L. Berkowitz, ed., Advances in Experimental Social Psychology, Vol. 7. New York: Academic Press.

Anderson, N. 1981 Foundations of Information Integration Theory. New York: Academic Press.

Bartlett, F. 1932 Remembering. Cambridge, England: Cambridge University Press.

Belson, W. 1968 Respondent understanding of survey questions. Polls 3:1-13.

Bem, D. 1967 Self-perception: an alternative explanation of cognitive dissonance phenomena. Psychological Review 74:183-200.

Bem, D. 1972 Self-perception theory. In L. Berkowitz, ed., Advances in Experimental Social Psychology, Vol. 6. New York: Academic Press.

Bower, G. 1981 Mood and memory. American Psychologist 36:129-148.

Bower, G., Black, J., and Turner, T. 1979 Scripts in text comprehension and memory. Cognitive Psychology 11:177-220.

Bradburn, N. 1982 Question-wording effects in surveys. In R. Hogarth, ed., Question Framing and Response Consistency. San Francisco: Jossey-Bass.

Bransford, J., Barclay, J., and Franks, J. 1972 Sentence memory: a constructive versus interpretive approach. Cognitive Psychology 3:193-209.

Bransford, J., and Johnson, M. 1972 Contextual prerequisites for understanding: some investigations of comprehension and recall. Journal of Verbal Learning and Verbal Behavior 11:717-726.

Brown, R., and Kulik, J. 1977 Flashbulb memories. Cognition 5:73-99.

Cantor, N., and Mischel, W. 1977 Traits as prototypes: effects on recognition memory. Journal of Personality and Social Psychology 35:38-49.

Chaiken, S., and Eagly, A. 1976 Communication modality as a determinant of message persuasiveness and message comprehensibility. Journal of Personality and Social Psychology 34:605-614.

Chapman, L., and Chapman, J. 1969 Illusory correlation as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology 74:271-280.

Colegrove, F. 1898 Individual memories. American Journal of Psychology 10:228-255.

Collins, A., and Quillian, M. 1972 Experiments on semantic memory and language comprehension. In L. Gregg, ed., Cognition in Learning and Memory. New York: John Wiley and Sons.

Converse, P. 1963 Attitudes and Non-Attitudes: Continuation of a Dialogue. Paper presented at meeting of the International Congress on Psychology, Washington, D.C.

Crowne, D., and Marlowe, D. 1964 The Approval Motive. New York: Wiley.

Einhorn, H., and Hogarth, R. 1978 Confidence in judgment: persistence of the illusion of validity. Psychological Review 85:395-416.

Erdelyi, M., and Kleinbard, J. 1976 Has Ebbinghaus decayed over time? The growth of recall (hypermnesia) over days. Journal of Experimental Psychology: Human Learning and Memory 4:275-278.

Fee, J. 1979 Symbols and Attitudes: How People Think About Politics. Unpublished doctoral dissertation, University of Chicago.

Fields, J., and Schuman, H. 1976 Public beliefs about the beliefs of the public. Public Opinion Quarterly 40:427-448.

Fischhoff, B. 1975 Hindsight ≠ foresight: the effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance 1:288-299.

Flavell, J., and Wellman, H. 1977 Metamemory. In R. Kail and J. Hagen, eds., Perspectives on the Development of Memory and Cognition. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Freedman, J. 1965 Long-term behavioral effects of cognitive dissonance. Journal of Experimental Social Psychology 1:145-155.

Hamilton, D., and Gifford, R. 1976 Illusory correlation in interpersonal perception: a cognitive basis of stereotypic judgments. Journal of Experimental Social Psychology 12:392-407.

Harris, J. 1978 External memory aids. In M. Gruneberg, P. Morris, and R. Sykes, eds., Practical Aspects of Memory. London: Academic Press.

Hass, R., and Linder, D. 1972 Counterargument availability and the effect of message structure on persuasion. Journal of Personality and Social Psychology 23:219-233.

Hastie, R. 1983 Social inference. Annual Review of Psychology 34:511-542.

Hastie, R., and Kumar, P. 1979 Person memory: personality traits as organizing principles in memory for behavior. Journal of Personality and Social Psychology 37:25-38.

Haviland, S., and Clark, H. 1974 What's new? Acquiring new information as a process in comprehension. Journal of Verbal Learning and Verbal Behavior 13:512-521.

Holmes, D. 1970 Differential change in affective intensity and the forgetting of unpleasant personal experiences. Journal of Personality and Social Psychology 15:235-239.

Jones, E., and Goethals, G. 1972 Order effects in impression formation: attribution context and the nature of the entity. In E. Jones, D. Kanouse, H. Kelley, R. Nisbett, S. Valins, and B. Weiner, eds., Attribution: Perceiving the Causes of Behavior. Morristown, N.J.: General Learning Press.

Kahneman, D., and Tversky, A. 1971 Subjective probability: a judgment of representativeness. Cognitive Psychology 3:430-454.

Kahneman, D., and Tversky, A. 1973 On the psychology of prediction. Psychological Review 80:237-251.

Keenan, J., MacWhinney, B., and Mayhew, D. 1977 Pragmatics in memory: a study of natural conversation. Journal of Verbal Learning and Verbal Behavior 16:549-560.

Kinder, D., and Sears, D. 1981 Prejudice and politics: symbolic racism versus racial threats to the good life. Journal of Personality and Social Psychology 40:414-431.

Kintsch, W., and van Dijk, T. 1978 Toward a model of text comprehension and production. Psychological Review 85:363-394.

Langer, E. 1975 The illusion of control. Journal of Personality and Social Psychology 32:311-328.

Linton, M. 1982 Transformations of memory in everyday life. In U. Neisser, ed., Memory Observed. San Francisco: W.H. Freeman and Company.

Loftus, E. 1982 Memory and its distortions. Pp. 119-154 in A.G. Kraut, ed., The G. Stanley Hall Lecture Series. Washington, D.C.: American Psychological Association.

Loftus, E., and Beach, L. 1982 Human inference and judgment: is the glass half empty or half full? Stanford Law Review 34:939-956.

Loftus, E., and Marburger, W. 1983 Since the eruption of Mt. St. Helens, did anyone beat you up? Improving the accuracy of retrospective reports with landmark events. Memory & Cognition 11:114-120.

Loftus, E., and Palmer, J. 1974 Reconstruction of automobile destruction: an example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behavior 13:585-589.

Luce, R. 1959 Individual Choice Behavior. New York: John Wiley and Sons.

Mandler, J., and Johnson, N. 1977 Remembrance of things parsed: story structure and recall. Cognitive Psychology 9:111-151.

Markus, H. 1977 Self-schemata and processing information about the self. Journal of Personality and Social Psychology 35:63-78.

McArthur, L., and Post, D. 1977 Figural emphasis and person perception. Journal of Experimental Social Psychology 13:520-535.

McGuire, W. 1960 A syllogistic analysis of cognitive relationships. In M. Rosenberg, C. Hovland, W. McGuire, R. Abelson, and J. Brehm, eds., Attitude Organization and Change. New Haven: Yale University Press.

Miller, G. 1979 Images and models, similes and metaphors. In A. Ortony, ed., Metaphor and Thought. Cambridge, England: Cambridge University Press.

Miller, G., and Johnson-Laird, P. 1976 Language and Perception. Cambridge, Mass.: Harvard University Press.

Neisser, U. 1976 Cognition and Reality: Principles and Implications of Cognitive Psychology. San Francisco: W.H. Freeman and Company.

Neisser, U. 1978 Memory: what are the important questions? In M. Gruneberg, P. Morris, and R. Sykes, eds., Practical Aspects of Memory. London: Academic Press.

Neisser, U. 1982 Memory: what are the important questions? In U. Neisser, ed., Memory Observed. San Francisco: W.H. Freeman and Company.

Nisbett, R., and Ross, L. 1980 Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, N.J.: Prentice-Hall.

Nisbett, R., and Wilson, T. 1977 Telling more than we can know: verbal reports on mental processes. Psychological Review 84:231-259.

Orne, M. 1962 On the social psychology of the psychological experiment: with particular reference to demand characteristics and their implications. American Psychologist 17:776-783.

Ortony, A. 1979 Beyond literal similarity. Psychological Review 86:161-180.

Rokeach, M. 1971 Long-range experimental modification of values, attitudes, and behavior. American Psychologist 26:453-459.

Rosch, E., and Lloyd, B. 1978 Cognition and Categorization. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Rosenberg, M. 1956 Cognitive structure and attitudinal affect. Journal of Abnormal and Social Psychology 53:367-372.

Rosenthal, R., and Jacobson, L. 1966 Teachers' expectancies: determinants of pupils' IQ gains. Psychological Reports 19:115-118.

Ross, L. 1977 The intuitive psychologist and his shortcomings. In L. Berkowitz, ed., Advances in Experimental Social Psychology, Vol. 10. New York: Academic Press.

Ross, L., Greene, D., and House, P. 1978 The false consensus phenomenon: an attributional bias in self-perception and social perception processes. Journal of Experimental Social Psychology 13:279-301.

Ross, L., Lepper, M., and Hubbard, M. 1975 Perseverance in self-perception and social perception. Journal of Personality and Social Psychology 32:880-892.

Rumelhart, D. 1975 Notes on a schema for stories. In D. Bobrow and A. Collins, eds., Representation and Understanding: Studies in Cognitive Science. New York: Academic Press.

Rumelhart, D., and Abrahamson, A. 1973 A model for analogical reasoning. Cognitive Psychology 5:1-28.

Rumelhart, D., and Ortony, A. 1977 The representation of knowledge in memory. In R. Anderson, R. Spiro, and W. Montague, eds., Schooling and the Acquisition of Knowledge. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Sachs, J. 1967 Recognition memory for syntactic and semantic aspects of connected discourse. Perception & Psychophysics 2:437-442.

Salancik, G., and Conway, C. 1975 Attitude inferences from salient and relevant cognitive content about behavior. Journal of Personality and Social Psychology 32:829-840.

Schank, R., and Abelson, R. 1977 Scripts, Plans, Goals, and Understanding. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Sheinwold, K., and Tenney, Y. 1982 Memory for a salient childhood event. In U. Neisser, ed., Memory Observed. San Francisco: W.H. Freeman and Company.

Simon, H., and Stedry, A. 1969 Psychology and economics. In G. Lindzey and E. Aronson, eds., The Handbook of Social Psychology, Vol. 5. Reading, Mass.: Addison-Wesley.

Snyder, M., and Swann, W. 1978 Behavioral confirmation in social interaction: from social perception to social reality. Journal of Experimental Social Psychology 14:148-162.

Sykes, W. 1982 Investigation of the effects of question form. Survey Methods Newsletter, pp. 9-10.

Taylor, S., and Crocker, J. 1980 Schematic bases of social information processing. In E. Higgins, P. Herman, and M. Zanna, eds., Social Cognition: The Ontario Symposium, Vol. 1. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Taylor, S., and Fiske, S. 1975 Point of view and perceptions of causality. Journal of Personality and Social Psychology 32:439-445.

Tulving, E. 1968 When is recall higher than recognition? Psychonomic Science 10:53-54.

Tulving, E. 1972 Episodic and semantic memory. In E. Tulving and W. Donaldson, eds., Organization of Memory. New York: Academic Press.

Tulving, E., and Thomson, D.M. 1973 Encoding specificity and retrieval processes in episodic memory. Psychological Review 80:352-373.

Tversky, A.
 1972 Elimination by aspects: a theory of choice. Psychological Review 79:281-299.
Tversky, A., and Kahneman, D.
 1971 Belief in the law of small numbers. Psychological Bulletin 76:105-110.
 1973 Availability: a heuristic for judging frequency and probability. Cognitive Psychology 5:207-232.
Weinstein, N.
 1980 Unrealistic optimism about future life events. Journal of Personality and Social Psychology 39:806-820.

POTENTIAL CONTRIBUTIONS OF COGNITIVE RESEARCH TO SURVEY QUESTIONNAIRE DESIGN

Norman Bradburn and Catalina Danis

The purpose of this paper is to review what research on cognitive processes can contribute to the understanding of errors in survey questioning. The paper is not meant to be a comprehensive review of the literature, but rather an attempt to discuss some problems that are of concern to survey researchers in the light of research results from the relevant cognitive literature. If the paper is successful, it should further a dialogue between two research traditions that potentially have much to say to one another but so far have not addressed each other. We start by presenting a model of response effects, a model of human information processing, and a discussion of the differences between research traditions in the two fields to indicate some of the difficulties in trying to bring them together. We then proceed to review some of the major response effects discussed in the survey research literature in the light of what appears to be the most relevant cognitive literature.

Conceptual Model of Response Effects

Our model of the survey data collection process conceives of the research interview as a micro-social system. In an idealized form, the system consists of two roles linked by the task of transmitting information from the respondent to the interviewer (and ultimately to the investigator). We distinguish three sources of variation in the quality of the data: that stemming from the characteristics of the task itself, that from the interviewer's performance, and that from the respondent. Much of the research on response effects has focused on interviewer and respondent characteristics: for example, the race of the interviewer or the propensity of the respondents to agree to statements without regard to their content.
This concentration of effort is probably misplaced because it is the task itself that gives rise to what Orne (1969) has called the "demand characteristics of the situation." The demand characteristics, in turn, play the predominant role in determining the behavior of the actors in the situation. Thus variables affecting the characteristics of the task are at the heart of a model of response effects. Indeed, the empirical literature suggests that the characteristics of the task are the major source of response effects and are, in general, much larger than effects due to interviewer or respondent characteristics.

The task in surveys of human populations is to obtain information from a sample of respondents about their (or someone else's) behavior, knowledge, or attitudes. The respondent's role is to provide that

*The discussion in this section is drawn from Bradburn (1983).

information; the interviewer's, to obtain the information in the manner prescribed by the investigator (who defines the task by drawing the sample, designing the questionnaire, and specifying the observations to be employed in the research). If respondents are to be "good" respondents, they must provide accurate and complete information. Careful attention must be given to motivating respondents to play such a role and to defining the situation for them so that they know accurately what it is that they are supposed to do. Similarly, through training, supervision, and careful specification of the questionnaire and its mode of administration, the investigator sets the standards by which interviewers will be judged on how well they have performed their role.

Within this general framework, we can see that there are three sources of response effects. The first source is the respondents themselves. While we expect that most of the variance in responses among respondents comes from real differences, it is possible that there are individual differences among respondents that systematically affect their willingness or ability to give accurate responses, particularly to certain kinds of questions, such as those that might affect their self-esteem. In addition, other factors, such as the presence of other people during the interview, events that happened to the respondent before the interview began, or social pressures not to cooperate with strangers, may undermine the willingness of respondents to take the time or make the effort to be "good" respondents.

The interviewer's role may be more or less prescribed. In some surveys, interviewers are given considerable freedom in defining the task for themselves and for respondents, particularly with regard to the formulation of questions or the follow-up of answers to previous questions.
Today, however, most large-scale surveys use heavily structured questionnaires that leave little room for independent judgment about what questions to ask, what order to ask them in, or what to do when respondents answer one way rather than another. Interviewers, of course, do not always do what they are supposed to do, and it is impossible to anticipate everything that might happen; some things must be left to the interviewer's discretion. The potential for response effects due to differences in interviewer behavior is real, even in the most tightly structured survey.

The task should be defined carefully by the investigator. Task definition is primarily a matter of what questions are asked; how they are asked, that is, their form and wording; the order in which they are asked; what is said by way of introduction to the survey or to particular questions; and the mode of administration of the questionnaire. It is also the source of the largest response effects (Sudman and Bradburn, 1974).

Let us look at some of the ways in which question wording and context may affect response validity. Consider the following question: "In which of these groups did your total family income, from all sources, fall last year--before taxes, that is?" [Respondent is shown a card with income categories on it from which to choose.]

Several things that might bias the reporting of income occur immediately to us. First, respondents might deliberately omit some types of income that they do not want anyone to know about, such as income from

illegal sources or income not reported on their income tax returns. They may forget about some income or report estimates of income where good records are not readily available. A third problem may arise from misunderstanding the question or not defining terms the same way as the investigator intended. For example, should gifts, inheritances, or insurance payments be reported as income? What about noncash income? The question may not make clear what the investigator has in mind when asking about income. Respondents may also include income from the wrong time period. That error is in remembering when something happened, rather than whether it happened. Finally, some respondents may deliberately misreport their income to impress the interviewer or to make themselves look better off than they are, or vice versa.

We can summarize these types of errors by noting that they fall into three classes: (1) deliberate errors, in which the respondent adds, omits, or distorts information in order to make a good impression on the interviewer or to prevent the interviewer from finding out something about him; (2) memory errors, which may be about whether something happened or when it happened; and (3) communication errors, that is, either the investigator does not make clear to the respondent what is being asked or the respondent fails to make clear the response to the interviewer, so that a wrong answer is recorded.

Deliberate errors come from the respondent (and occasionally from the interviewer). Memory errors come from the respondent, but their magnitude may be affected by the way in which the task is defined, that is, by the cues that are given in the questionnaire. Communication errors may come from respondents who do not make their responses clear to the interviewer, from interviewers who do not make clear what they are asking, or from the task, that is, the questions as formulated do not communicate the intended meaning to the respondents.
Research on cognitive processes is most relevant to understanding memory and communication errors. We shall thus concentrate our attention in this paper on them.

Conceptual Model for Human Information Processing

There are a number of models for human information processing; they differ in their details. We shall adopt a very general model that seems to be common to almost all approaches, without attempting to suggest that any particular model is the more nearly correct one. Information enters the system via one or more sensory modalities. The entire input is held briefly in a sensory store from which information is selected for transfer to short-term memory (STM). The contents of the STM are operated on as appropriate for the cognitive tasks being processed and may be encoded and transferred to long-term memory (LTM) and/or used in further processing, leading eventually to some behavioral output. Presiding over this system is some sort of central processor that performs the operations on the input, analyzes the information, and directs the output. Schematically the system looks something like this:

                      Central Processor
                    (directs all stages)
                             |
    Input ---> Sensory store ---> STM <---> LTM
                                   |
                                   +---> Output

For the purposes of studying response errors in surveys, the critical parts of the model are the central processor, STM and LTM, and their interrelationships. The central processor is conceptualized as the part of the system that directs cognitive processes, performs the logical operations, and generally does those things we lump under the rubric of "thinking." The STM is roughly what we mean when we talk about the focus of attention or what in older terminology might be thought of as consciousness. LTM is here meant to refer to the store of past experience and learning that is somehow mentally represented so that it can be used by the central processor in thinking.

Following Tulving (1972) we will distinguish two LTM subsystems--semantic and episodic memory. Semantic memory is the system concerned with storage and utilization of knowledge about words and concepts, their properties, and their interrelationships. Episodic memory is concerned with temporally dated, spatially located, and personally experienced events or episodes and temporal-spatial relationships between such events. It may be convenient later to distinguish more than two subsystems of LTM, but two is probably the minimum.

Since a survey interview consists of a sequence of questions and answers, we look to cognitive studies for understanding of comprehension of oral or written text on the input side and of information retrieval from episodic memory on the output side.

There are several additional concepts that are useful for thinking about information processing related to survey questions. One of the most important of these is that memories are stored in some organized, structured form and not as isolated units. Following Bartlett (1932) we call these structures schemata. Schemata are basic to the organization of both semantic and episodic memory.
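The flow of information through the stages of this general model can be rendered as a deliberately crude sketch in modern code. Nothing here comes from the paper itself: the function name, the toy inputs, and the selection rules are all illustrative assumptions; the sketch only shows that each stage passes a filtered subset of the previous stage onward.

```python
def process(inputs, attend, encode_to_ltm):
    """Toy rendering of the general model: the entire input is held
    briefly in a sensory store, a selected subset enters STM, and a
    further subset is encoded into LTM. All names are illustrative."""
    sensory_store = list(inputs)                      # entire input, held briefly
    stm = [i for i in sensory_store if attend(i)]     # selected for attention
    ltm = [i for i in stm if encode_to_ltm(i)]        # encoded into long-term store
    return stm, ltm

# Hypothetical example: only attended items reach STM, and only
# some of those are encoded into LTM.
stm, ltm = process(
    ["phone ringing", "survey question", "traffic noise"],
    attend=lambda x: x != "traffic noise",
    encode_to_ltm=lambda x: x == "survey question",
)
print(stm)  # ['phone ringing', 'survey question']
print(ltm)  # ['survey question']
```

The sketch deliberately omits the central processor's control loop; its point is only the successive narrowing from sensory store to STM to LTM that the text describes.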
Words and events may be encoded in many different schemata. Stimuli that activate searches of memory are assimilated to particular schemata that direct the search within their structure. If the wrong schemata are activated, the search may be slowed considerably or fail altogether. For example, if one is involved in moving one's office and someone asks, "Where did that table go?", one will begin to think about what has happened to pieces of furniture that were previously there. If, however, in the middle of a conversation a research assistant comes in looking for some computer output and asks, "Where did that table go?", thinking about pieces of furniture may fail to produce the information desired.

Models of the retrieval process are generally couched within the framework of the encoding specificity principle posited by Tulving and associates (e.g., Tulving and Thomson, 1973; Tulving and Wiseman, 1976). In its strong form the principle asserts that retrieval cues will be effective at retrieval only if they were part of the context at the time of encoding. This strong version has been challenged (see Burke and Light, 1981, for a discussion), but there is abundant evidence that recall performance can be improved by the reinstatement of cues at retrieval that were present at encoding.

A distinction is also made between recognition and recall. Recognition requires a judgment that a particular stimulus occurred (e.g., a word was on a list), while recall requires a memory search to reproduce a particular response. Most retrieval models imply that recognition will be better (faster and more accurate) than recall because one has only to recognize the stimulus rather than to generate and then recognize it. Tulving and Thomson (1973), however, have shown that there are conditions under which recall can be better than recognition (e.g., when the cue in the recognition task activates the wrong association). In the relevant experiments subjects were shown three successive lists of word-pairs. The first two lists were designed to induce subjects to encode each target word with respect to another word. The target words, each paired with its cue word, were shown visually one at a time. Immediately after the end of the presentation of the third list the subjects were provided with 24 haphazardly ordered cue words on a recall sheet and asked to write down the target words. The cue words on the list had a weak association with the target words (e.g., ground/COLD). In a subsequent test phase, words not on the list but having a strong association to the target words (e.g., hot/COLD) were used as retrieval cues. Subjects were then given a variety of tasks that asked them to recall target words, to generate words through associations with the cue words, and to recognize words that they had generated by their associations as belonging to the target list. These experiments demonstrated conditions under which subjects could recall words but did not recognize them as belonging to the target list. The number of words that were recalled but not recognized was about 15 times the number of words that could be recognized but not recalled.
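The encoding specificity logic behind the ground/COLD versus hot/COLD contrast can be sketched as a toy lookup, under the strong form of the principle. This is not a psychological model: the data structure, function names, and the "list-3" tag are all illustrative assumptions; the sketch only shows that a cue retrieves an event when it was part of the stored encoding context, even though a stronger associate that was absent at encoding retrieves nothing.

```python
# Toy long-term store: each event is stored together with the
# contextual cues present at the time of encoding.
memory = []

def encode(event, context_cues):
    """Store an event along with its encoding context."""
    memory.append((event, set(context_cues)))

def retrieve(cue):
    """Under the strong encoding specificity principle, a cue is
    effective only if it was part of the encoding context."""
    return [event for event, cues in memory if cue in cues]

# "COLD" is encoded in the context of the weak associate "ground".
encode("COLD", {"ground", "list-3"})

print(retrieve("ground"))  # cue present at encoding -> ['COLD']
print(retrieve("hot"))     # strong associate, absent at encoding -> []
```

The asymmetry in the two lookups is the crux of the experiments above: a normatively "better" cue (hot) fails where a weak but context-matching cue (ground) succeeds.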
These findings might have some application to understanding the differences between responses to open-ended and closed format questions. Sometimes, questions with lists (e.g., problems facing the U.S., worries) produce different response frequencies than do free-recall questions. The cognitive processes that produce these different response frequencies are not well understood. Although it is clear that much information that reaches the sensory store is lost and never enters even STM, it also appears that many things enter memory without explicit (conscious) instructions to store them. There is no clear understanding of what determines how well something will be remembered. Rehearsal, labeling, and breadth of relational networks seem to be important in influencing memory, but their relationships are not well worked out. Temporal aspects of memory are important to survey researchers because many surveys involve questions about frequency of events or events that occurred at particular times. The placement of events in time may depend on events being coded with specific markers (e.g., dates) or be inferred from trace strength (e.g., the more vivid the memory trace, the more recent the event). The growing literature on temporal memory will be discussed later.

Some Differences Between Survey and Cognitive Research Traditions

Before proceeding to a more explicit examination of survey methodological problems from a cognitive point of view, we should note some differences in the research traditions of the two fields that make applications of the findings from one field to the other difficult. While these differences are not hard and fast, they do represent emphases that shape the problems addressed and the methods used to study them.

Survey researchers typically focus on properties of the stimulus (questions) that might affect the valid processing of information. These properties can be classified into three categories: (1) wording of the question; (2) structure of the response alternatives; (3) context of the question, e.g., other questions asked, the instructions to the respondent, the setting in which the interview is conducted. The concern here is that the meaning of the question be the same to all respondents, and, of course, be the same meaning as that intended by the investigator. We might rephrase this concern by saying that survey researchers are concerned that they are setting the same cognitive task for all of the respondents. There is relatively less attention paid to whether or not a particular cognitive task is possible for the respondent or whether it is an easy or difficult task.

Cognitive researchers, however, typically focus on properties of the processing system that affect the way in which information is handled. These may also be classified into three categories: (1) encoding strategies; (2) retrieval strategies; (3) stores (lexical, semantic, episodic, or archival). Experimental cognitive research is oriented toward studying microprocesses of information processing. Survey methodological research is oriented toward studying macroprocesses of comprehension and information retrieval.
Much of experimental cognitive research involves laboratory tasks that are often artificial because of the need to control the effects of other variables. The subjects tend to be college students or others who have a fairly high level of education (and presumably intelligence). There is thus a question about the degree to which findings from the laboratory can be translated to the complexities of a field survey.

There are a considerable number of field experiments in the survey methodological literature. The most common method is to use alternate forms of questions (split ballots) with random assignment of question forms or wording to 1/nth of the sample. These experiments typically have much larger sample sizes than the cognitive laboratory experiments and a more heterogeneous set of respondents on such characteristics as age and education. They have, however, very little control over (and often little knowledge about) the past learning of the respondents.

Applying Cognitive Research to the Study of Response Effects

Because we are reviewing sources of error in survey questioning in the light of cognitive research, we organize our discussion more in line with the questions that are most studied by survey researchers rather than the other way around.
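The split-ballot design described above can be sketched in a few lines of code. Everything here is an illustrative assumption rather than material from the paper: the function names, the toy form labels, and the fake answers are hypothetical; the sketch shows only the core logic of randomly assigning each respondent to one question form and then tallying answers separately by form so the two distributions can be compared.

```python
import random
from collections import Counter

def split_ballot(respondent_ids, forms, seed=0):
    """Randomly assign each respondent to one of the question forms,
    so that differences in response distributions can be attributed
    to the form rather than to the respondents."""
    rng = random.Random(seed)  # seeded for reproducibility
    return {r: rng.choice(forms) for r in respondent_ids}

def compare_forms(assignment, answers):
    """Tally answers separately for each form; the analyst then
    compares the per-form distributions (e.g., with a chi-square test)."""
    tallies = {}
    for respondent, form in assignment.items():
        tallies.setdefault(form, Counter())[answers[respondent]] += 1
    return tallies

# Hypothetical example: six respondents split across two wordings.
assignment = split_ballot(range(6), ["forbid", "allow"])
answers = {r: "yes" if r % 2 else "no" for r in range(6)}
print(compare_forms(assignment, answers))
```

In a real field experiment the assignment would be built into the sample design and the comparison would use a formal significance test; the sketch only isolates the randomization-and-tally logic.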

Question Wording

We noted above that one of the major sources of errors in surveys arises from imperfect communication of meaning from the interviewer to the respondent. Sometimes this miscommunication results from a failure to specify what is to be included in a particular concept (e.g., what income is to be included in a report of family income), and sometimes it arises from the imprecision of the concept itself (e.g., in questions about concepts such as liberalism, defense policy, confidence in a particular institution, etc.). Respondents often recognize that the referent of the question is ambiguous and ask for clarification. In response to the question, "What things do you like best about living in this neighborhood?", a respondent might ask sensibly, "What do you mean by 'this neighborhood'?" Unless the researcher has a specific definition in mind and supplies it to the respondents, respondents are usually told to define the concept for themselves. This, of course, is what people do implicitly with the concepts in all questions. The range of interpretations may be considerably greater than the investigator realizes, however, and differences in interpretation may make responses difficult to analyze. Of course, an analogous problem exists in understanding the responses given by respondents. The researcher needs to know how respondents interpreted the question in order to know how to interpret the answer.

The comprehension process is generally conceptualized as the result of an interaction between the input material (i.e., the question) and previous knowledge. Bransford and Johnson (1972) have shown how previously incomprehensible descriptions become comprehensible when the proper context for understanding them is given. They used the following passage (p. 400):

The procedure is actually quite simple. First you arrange items into different groups. Of course, one pile may be sufficient depending upon how much there is to do.
If you have to go somewhere else due to the lack of facilities, that is the next step; otherwise, you are pretty well set. It's important not to overdo things. That is, it is better to do too few things at once than too many. In the short run this may not seem important but complications can easily arise. A mistake can be expensive as well. At first, the whole procedure will seem complicated. Soon, however, it will become just another fact of life. It is difficult to foresee any end to the necessity for this task in the immediate future, but then one can never tell. After the procedure is completed, one arranges the materials into different groups again. Then they can be put into their appropriate places. Eventually they will be used once more and the whole cycle will then have to be repeated. However, that is part of life.

College students had a great deal of difficulty understanding this passage unless they were told that it was about washing clothes before they heard it and thus knew the context within which to process the information.

Pichert and Anderson (1977) showed that the events that are recalled from a passage can be affected by manipulating the perspective taken by

the respondent. They used a passage describing two boys playing in a house. One group of respondents read the passage from the perspective of a home buyer; the other group, from the perspective of a burglar. Each group recalled details that were appropriate to the perspective from which they were reading the passage.

A good example of the differences in comprehension of the same term is given by Fee (1979), who investigated the meanings of some common political terms used in public opinion studies. She adopted a variant of a technique developed by Belson (1968) in which respondents were asked to elaborate on their understanding of terms such as "big government" or "big business." She found that there were clusters of meanings, which gave rise to different interpretations of questions. For example, she found that the term "big government" had four distinct images: in terms of welfare, socialism, and overspending; in terms of big business and government for the wealthy; in terms of a combination of federal control and diminished states' rights; and in terms of bureaucracy and a lack of due process. Different kinds of people tended to hold different images, which in turn were related to different attitudes. Unfortunately there did not seem to be any way of determining in advance how a respondent would interpret the term.

Different Questions on the Same Topic

In the early 1950s, both Gallup and the National Opinion Research Center (NORC) asked questions about support for the Korean War. The Gallup question was: "Do you think the United States made a mistake in deciding to defend Korea, or not?" The NORC question was: "Do you think the United States was right or wrong in sending American troops to stop the Communist invasion of South Korea?" The NORC question consistently drew more support for the war than the Gallup question. There are actually three important differences between these two questions that may have affected the responses to them.
The one to which most attention has been paid is the addition of a reference to stopping the Communist invasion. Researchers have generally found that there is greater approval for American foreign policy decisions when the decision is described as "attempting to stop Communism." A second difference is the use of the terms "right or wrong" in the NORC question and the use of the term "mistake" in the Gallup question. We do not know whether "wrong" is conceptually equivalent to "mistake," particularly when it is paired with "right." Finally, the Gallup question is asked in a form that implicitly presumes that the policy was a mistake; that is, the respondent has to deny the main proposition of the question in order to support the war. In the NORC question, the respondent has to choose between two alternatives, one positive and one negative.

What can we say about these differences from the cognitive processing point of view? Regarding the use of politically sensitive terms, we might hypothesize that the evocation of a new element (reference to stopping the Communist invasion) causes more respondents to assimilate the question

partially to the schema "stopping Communism" than would be the case without the explicit reference. If the schema "stopping Communism" is highly approved, then we would expect that more approval would be expressed when the phrase is explicitly part of the question than when it is not. One might also ask whether the term "defend" is assimilated to the same schema as "sending American troops." In order to understand fully the effect of the different wordings, one would need to know the different schemata involved in processing the text of the questions and the underlying evaluation of the different schemata. Schuman and Presser (1981) present further evidence of wording changes that change the apparent meaning of otherwise similar questions.

The problem of synonyms is a difficult one for question wording. Enough is known from methodological studies of question wording to know that the path is fraught with booby traps (Payne, 1951). Similar, if not exactly synonymous, terms that indicate a positive orientation toward an attitude object can have different connotations and yield different responses. For example, the terms "approve" and "disapprove," and "like" and "dislike," are frequently used in attitude questions, although little attention has been paid to different implications of the two terms. At least one use of the two terms suggested that they are not exactly the same (Murray et al., 1974).

Choosing between two alternatives appears to be a more difficult cognitive task than simply affirming or denying the truth of a proposition. Thus slower and perhaps more thoughtful answers may be given when the respondent has to choose between two or more alternatives.
Also, the constraints of the interview situation may tend to make it preferable for the respondent to agree rather than disagree with a statement, particularly if it is a proposition that they do not have strong (or any) opinions about. Systematic research on these aspects of question wording could be fruitful.

Antonyms are also troublesome. Approval of a positively worded attitudinal statement is frequently not the same as disapproval of the same attitude when it is expressed negatively using an antonym of the positive wording. One of the best-known examples is that given by Rugg (1941):

"Do you think the United States should allow public speeches against democracy?"

"Do you think the United States should forbid public speeches against democracy?"

To the first question, 21 percent responded that speeches against democracy should be allowed; 39 percent said that such speeches should "not be forbidden." Similarly, 62 percent said that such speeches should "not be allowed," but only 46 percent said that they should be forbidden. (Other respondents had no opinion on the matter.) Clearly the concepts "allow" and "forbid" have somewhat different connotations, so that their negatives are not equivalent. We need to have more systematic research about how the processing of the negative form of statements differs from the processing of the positive form and how negations differ from statements using an antonym.

Generality Versus Specificity

Questions may differ on a generality-specificity dimension, and the degree of specificity contained in the question wording appears to affect responses. An example of the effect of increasing specificity can be seen in questions from a Gallup survey in May-June 1945:

Do you think the government should give money to workers who are unemployed for a limited length of time until they can find another job? [yes, 63 percent; no, 32 percent; don't know, 5 percent]

It has been proposed that unemployed workers with dependents be given up to $25 per week by the government for as many as 26 weeks during one year while they are out of work and looking for a job. Do you favor or oppose this plan? [yes, 46 percent; no, 42 percent; don't know, 12 percent]

Would you be willing to pay higher taxes to give unemployed persons up to $25 a week for twenty-six weeks if they fail to find satisfactory jobs? [yes, 34 percent; no, 54 percent; don't know, 12 percent]

Since these questions do not differ in single elements but change several levels of specification in each form, it is impossible to say what elements were causing the changes in support. The key elements in these questions appear to be the referent for support (unemployed versus unemployed with dependents); the amount of time for support (a limited length of time versus 26 weeks); the amount of support (an unspecified amount of money versus $25 per week); the stopping rule (finding another job versus finding a satisfactory job); and the mode of financing the payments (not specified versus higher taxes). A systematic investigation of different elements such as these is typically not done in surveys.

Even with the same question wording, respondents may vary in the degree of generality with which they interpret questions. For example, Smith (1980) reports the following results from the 1973 and 1975 General Social Survey (GSS).
Respondents were first asked: "Are there any situations that you can imagine in which you would approve of a man punching an adult male stranger?" Respondents were then asked a series of questions about specific conditions under which they would approve of such an action, such as the stranger hit the man's child after the child had accidentally damaged the stranger's car, or the stranger was beating up a woman and the man saw it. Many respondents who said that they would not approve of hitting a stranger "under any situations that they could imagine" went on to express approval for hitting when specific situations were described. Smith suggests that many respondents are not interpreting the general question as literally asked but responding instead to the absolute phrase "are there any situations that you can imagine" as if it meant "in situations that you can easily think of" or simply "in general." The question as stated poses a difficult cognitive task for respondents, since it is asking them to imagine a very large range of

situations. Typically the flow of questioning does not give respondents very much time to think about the question, so, as Smith suggests, the effect is to limit responses to the situations that can easily be thought of. The task is also one that is not engaged in every day, so we would expect that the effect would be more pronounced among those who are less imaginative, whatever that might be correlated with, perhaps age or education. The specifications ease the cognitive task by providing what imagination was not able to do, that is, some specific situations in which hitting a stranger might be approved.

One way to ease the cognitive task and perhaps equalize the task more nearly for all respondents would be to give a longer introduction to the topic. Such an introduction might cite some of the more common examples where people are apt to approve, or at least tolerate, hitting a stranger. This is also an example of the type of question that might display a substantial order effect depending on whether the general question is asked before or after the more specific questions. (The topic of order effects is discussed below under the section on context effects.)

The mention of specific situations might be thought of as an example of prompts in aid of memory. In questions that require a fairly substantial amount of effort to answer completely, prompts in the form of examples of category instances are often given. For example, in the GSS the question on membership in voluntary organizations is asked as follows: "Now we would like to know something about the groups and organizations to which individuals belong. Here is a list of various kinds of organizations. Could you tell me whether or not you are a member of each type?" Respondents are then handed a card with 16 types of organizations on it, and they are asked about membership in each type of organization.
Even this type of prompt may fail to elicit good recall, since respondents still must have the particular organizations they are members of coded into the categories presented, and the cue of the category must trigger a search that will retrieve the fact that they are members. This type of prompting is called aided recall by survey researchers, and it produces a higher level of response to recall questions than forms of the question without such aids. The use of aided recall has an unfortunate side effect, however, for items involving time-bounded phenomena (e.g., magazines read in the last month, number of visits to the doctor in the last three months): it increases telescoping, that is, reporting instances of the event from the wrong time period. (Problems related to memory for the timing of events are discussed below.)

The processing of information on the part of respondents involves two levels of processing. First, the question must be processed according to the rules that we use to understand spoken or written language; that is, the question must be understood. Second, the representation of the question must be processed according to other rules (e.g., Kintsch and van Dijk, 1978, macrostructures) that retrieve the information necessary to answer the question and perform the operations necessary to make the judgments and produce the answer to the question.

The wording of the question provides the stimulus that sets off a complex set of cognitive processes which finally result in a response. Tulving and Thomson (1973:352) note: "Retrieval operations complete the act of remembering that begins with encoding of information about an event into the memory store. Thus, remembering is regarded as a joint product of information stored in the past and information present in the immediate cognitive environment of the rememberer." They go on to develop the encoding specificity principle (p. 353): "What is stored is determined by what is perceived and how it is coded, and what is stored determines what retrieval cues are effective in providing access to what is stored."

At first look, the encoding specificity principle suggests that we will have great trouble in using standardized questions in surveys, since it implies that there could be great individual variation in the way people encode events. Without denying that there may be some individual variation, it is likely that respondents who come from a similar culture and who speak the same language will encode most everyday events in much the same way. If this were not the case, we would not be able to communicate with one another as well as we do. The encoding specificity principle is useful in reminding us to pay more attention to the ways in which events are encoded and to develop the wording of questions so as to match those codes. For example, significant subgroups in the population (e.g., those identified by ethnic group, region, or education) may have different ways of encoding some types of events, so that we might have to alter question wording for different groups. Bradburn et al. (1979) did find somewhat higher (and probably more accurate) reports of use of substances such as drugs and alcohol when respondents were asked questions phrased in their own terms.
The encoding specificity principle may be most important in relation to attitude questions, for which the concepts are less well defined and have fewer shared behavioral referents. It suggests that we need to make much greater efforts to map the ways in which respondents encode the various attitude objects that we are studying, particularly if they are to be studied over some period of time. Since we frequently want to study change in attitudes over some period of time, it is extremely important that we understand better how to preserve the meaning of questions as events change.

For questions that require recall of past events, question wording may affect responses by directing attention toward or away from the major retrieval categories that are of interest in the question. For example, one of the purposes of an experiment by Vogel (1974), as part of a survey of Wyoming farmers in a Department of Agriculture study, was to study the effect of question wording on estimates of the calf crop on Wyoming cattle ranches in the year 1974. There was reason to believe that the number of calves being raised of less than 500 pounds was underreported. Two forms of the question on calves were used. The question asking about calves weighing less than 500 pounds was changed in the experimental version to emphasize the word "calves," because livestock producers often did not report unweaned calves, since they considered the cow and the calf to be one animal unit before weaning. The question was part of an inventory question whose introduction read:

Please report below all cattle and calves on the land you operate, regardless of ownership (include those now on feed). Also include those owned by this farm or ranch that are now on public grazing land. How many are:

3. beef cows? (Include heifers that have calved)
4. milk cows? (Include milk heifers that have calved)
. . .
8. HEIFER, STEER, and BULL CALVES weighing less than 500 pounds.

The alternate version was:

8. CALVES--heifer, steer, and bull calves weighing less than 500 pounds.

In the experimental version the superordinate category of interest (calves) was placed first, with the subordinate categories (heifer, steer, and bull calves) given less emphasis by being placed later in smaller type. The experimental version produced about a 10 percent higher estimate for calves weighing less than 500 pounds.

A further complication to the recall process is that remembering is believed to be a reconstructive process. Loftus and her associates (Loftus, 1977, 1980; Loftus and Palmer, 1974) have shown that memory for events can be supplemented and altered through the introduction of post-event information. The incorporation of post-event information can be attenuated to the extent that respondents are more certain about particular memories, if recall has been made prior to the introduction of the additional information, and if the source of the information is clearly suspect (Loftus, 1982).

Response Categories

One of the oldest and most puzzling phenomena in survey research concerns the differences between open- and closed-ended questions. In open-ended questions respondents are given no response categories and simply answer the questions using their own terminology. Closed-ended questions give the respondents a set of alternatives from which to choose answers.
The open and closed format difference is similar to the recognition/recall distinction in cognitive research. In the open format the respondent has to retrieve from memory the material that is asked for in the question. Consider a question such as "What are the most important problems facing the country today?" Asked in an open-ended format, such a question requires a considerable amount of cognitive work on the part of the respondent to define the limits of the concepts being called for in the question, then initiate a memory search for instances of these concepts, finally producing the responses. With a precoded question the categories of interest are much more delimited by the response categories offered. The respondent simply has to recognize which particular category is a true instance of behavior or attitude. As with the recognition task, precoded questions appear to be easier for respondents to handle, and in many instances (but not all) they produce as good if not better responses than do open questions. Aided recall generally produces fuller and more accurate responses than do questions without aids, assuming, of course, that the aids are adequate for the recall task. Depending upon the type of list used for aids, the task may be thought of as a pure recognition task or as a recall task with many cues to aid the memory search.

There is one type of behavior report question in which it is clear that the open-ended format does much better than precoded response categories. An experiment by Bradburn et al. (1979) compared the effects of precoded response categories and verbatim recording on behavior reports about topics for which there is a presumption of considerable underreporting, such as alcohol consumption. An example of the type of question used involved those who reported drinking beer during the previous year: "On the average, about how many bottles, cans, or glasses of beer did you drink at one time?" In the precoded version the codes were "1," "2," "3," "4," "5," and "6 or more." In the open-ended version there were no codes; the interviewer simply wrote down the number given by the respondent. Estimates of beer consumption based on open-ended questions were about 60 percent higher than those based on the precoded responses. The reasons for this difference are not entirely clear.
It is probable that the distribution of alcohol consumption has a long tail on the high side; thus those who report more consumption than allowed for by the highest precoded category tend to increase the average. It is also possible that the precodes are interpreted as an implicit norm and that some high consumers are reluctant to place themselves in the highest category, particularly if they do not consider themselves to be heavy drinkers.

With the type of questions studied by Bradburn and associates, in which open-ended questions produced better responses than precoded questions, the questions were quite specific in delimiting the categories of recall, and the responses were estimates of quantity or frequency. Thus the response dimension was well specified even though not explicitly given by response categories. More equivocal results have been found in a series of experiments conducted by Schuman and Presser (1981). They studied questions that involved multiple nominal responses to broad inquiries about values and problems, for example: "What do you think is the most important problem facing this country at present?" "People look for different things in a job; what would you most prefer in a job?" "While we're talking about children, would you please say what you think is the most important thing for children to learn to prepare them for life?" The results from their series of experiments were complex, but it was clear that there were statistically significant, substantively important differences in marginal distributions between open and closed forms of this type of question.

Schuman and Presser also demonstrated that the assumption of form-resistant correlations, that is, that even though marginal distributions may change between question forms, the correlations between responses and significant independent variables would be constant, is not tenable with respect to the open- and closed-ended question comparisons. For example, the correlation between education and work values such as high pay and steady pay varied by the form of the question used.

Comparing responses to open and closed forms may give us some insights into the cognitive processes at work in the two forms of question. A fairly consistent finding, at least for the broad attitudinal questions that Schuman and Presser studied, is that respondents give a greater range of answers in response to the open form than they do when they are given a list of precodes, even when there is a category for "other" responses. Respondents rarely use the "other" category in closed forms. Thus it would appear that one of the important effects of providing response categories is to define much more narrowly the range of categories that is to be included in the response to the question. It is also likely that there may be some unintended direction in the open forms that can be corrected by the provision of specific alternatives. For example, Schuman and Presser found that the response category "crime" was much more frequently reported as being one of the most important problems facing the country when the question was asked in a closed form rather than an open form. They hypothesized that the reference in the open question to "facing this country" discouraged respondents from thinking of crime as a problem, because crime is perceived by many as a local rather than a national problem. Providing "crime" as an explicit category in the closed form made it clear that the investigator intended to consider it a national problem.
That open versions of questions are cognitively more difficult for respondents is suggested by the fact that in the Schuman and Presser experiments the more educated, as compared with less educated, respondents gave more responses to the open-ended questions, and there were more missing data in the open forms among the less educated as compared to the well educated. Such differences, however, are less likely to occur when the open questions are extremely focused and require only a simple numerical answer.

Another issue of considerable importance and controversy in attitude measurement is the offering of an explicit "don't know" or "no opinion" category to the respondent. Research on this issue (see Schuman and Presser, 1981:Chapter 4) indicates that providing an explicit "don't know" (DK) option--called a filter--substantially increases the proportion of respondents who give DK responses. Typically, the DK increment is around 20 percent, but this appears to be unrelated to question content or to the DK level that prevailed in versions without an explicit DK filter. Schuman and Presser introduced the term "floaters" to refer to persons who give a substantive response to an item in a standard unfiltered form and a DK response to a filtered version of the same question. They suggest two models that might account for the nature and identity of floaters. One is a trait model that conceives of floaters as a distinct group who have a high propensity to say "don't know" when offered this choice on a filtered question but will not volunteer such a response in the absence of an explicit alternative. The second model, which they call a "threshold process model," suggests that there is a process of floating that is created by the form of the question for any given item. This model suggests that there are a number of variables that may influence a respondent's position on a DK-propensity dimension for particular item contents. The actual frequency of DK responses depends in part on respondents' position on this propensity dimension and in part on the height of the barrier created by the question form. As Schuman and Presser point out (1981:146):

Where the question form strongly discourages DK responses only those very high on the dimension will insist on giving a DK response. Where the question form facilitates or even encourages DK responses (e.g., by providing a full DK filter preliminary to the opinion question) those with both high and moderate positions on the dimension will choose the DK option. In both instances the underlying dimension is the same, but the question form creates different cutting points on it. Moreover, the difficulty of the question content (obscure versus familiar issues) will also greatly affect cutting points on the dimension, though, according to the model, equally for both standard and filtered question forms. Thus the DK propensity dimension influences the giving of DK responses on both forms, but there is no special trait that distinguishes floaters as such, nor is there a special group of people to be set apart and described by this term.
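The threshold process model lends itself to a simple illustration. In the sketch below (a hypothetical simulation, not an analysis of the Schuman and Presser data), each respondent has a latent DK propensity, and the two question forms differ only in where they cut that dimension; the uniform propensity distribution, the barrier heights, and the sample size are assumptions chosen to reproduce an increment of roughly 20 percentage points.

```python
import random

def dk_rate(n_respondents, barrier, seed=0):
    """Share of respondents giving a "don't know" (DK) response when the
    question form places a given barrier on the latent DK-propensity
    dimension.  Propensities are drawn uniformly on [0, 1]; both the
    distribution and the barrier heights are illustrative assumptions."""
    rng = random.Random(seed)
    return sum(rng.random() > barrier for _ in range(n_respondents)) / n_respondents

# Standard form: DK is discouraged, so only respondents very high on the
# propensity dimension volunteer it.
standard = dk_rate(10_000, barrier=0.90)

# Filtered form: an explicit DK option lowers the barrier; the same latent
# dimension is cut at a lower point, so moderate propensities also cross it.
filtered = dk_rate(10_000, barrier=0.70)

print(f"standard form DK rate: {standard:.1%}")
print(f"filtered form DK rate: {filtered:.1%}")
print(f"DK increment from filtering: {filtered - standard:.1%}")
```

Note that nothing in the simulation distinguishes a group of "floaters": the same propensity distribution generates both rates, and the increment comes entirely from moving the cutting point.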
Additional questions of interest to survey researchers concern the effects of different response categories, such as the use of middle alternatives as opposed to dichotomous agree/disagree or approve/disapprove questions, the effects of using differing numbers of points on rating scales, and the use of familiar analogies such as ladders or thermometers for giving ratings of intensity of favorableness or unfavorableness toward specified attitude objects.

Contextual Meaning

Questions in surveys are asked in some context. Part of this context is set by the introductory material that the interviewer presents in order to gain the respondent's cooperation. The order in which questions are asked provides a context that will affect the interpretation of the question or provide different cues for retrieval processes. Question-order effects have been widely studied among survey researchers, although until recently there has been relatively little theoretical orientation. A fairly typical example of the types of effects found was reported by Noelle-Neumann (1970), who examined the designation of various foods as particularly "German." This study was part of an exploration of the image of three basic foodstuffs. In one form of the questionnaire respondents were asked first about potatoes, then about rice; in another form of the same questionnaire this order was reversed. When respondents were asked about potatoes first, 30 percent said potatoes were particularly "German." However, when respondents were asked about rice first, 48 percent said that potatoes were particularly "German." A similar order effect was found for the pair noodles/rice.

Findings of this sort suggest that the initial general question, e.g., "Are the following foods particularly German?", activates a general schema relative to the "Germanness" of foods. However, this schema is relatively vague in the beginning and becomes sharpened as different examples are presented. When examples that are particularly atypical (rice) come first, the various attributes of the schema become clearer--a kind of contrast phenomenon. A persistent mystery is that order effects tend to be asymmetric; that is, they affect only one question or response rather than both of the questions that are rotated. In the above example, order affected the proportion reporting potatoes as "German" but not the proportion reporting rice as "German."

More recently it has been observed that when a general question and a related, more specific question are asked together, the general question can be affected by position while the more specific question is not. For example, in the 1980 GSS, the following two questions were asked:

Taking all things together, how would you describe your marriage? Would you say that your marriage is very happy, pretty happy, or not too happy?

Taken all together, how would you say things are these days--would you say you were very happy, pretty happy, or not too happy?

The order of these two questions was rotated in split-ballot experiments. The results indicated that responses to the general question about overall happiness were affected by the order in which the questions were asked, while the question on marriage happiness was not affected by the order. The direction of the shift, however, was not the same in different experiments (see Schuman and Presser, 1981:Chapter 2).
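Whether a split-ballot difference of this kind could plausibly be sampling error is conventionally checked with a two-proportion z test on the two forms. The sketch below applies it to the potato item (30 versus 48 percent); the per-form sample sizes of 500 are an assumption for illustration, since the text does not report them.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions,
    using the pooled standard error -- the usual check that a split-ballot
    order effect exceeds sampling variability."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical sample sizes of 500 per questionnaire form.
z = two_proportion_z(0.48, 500, 0.30, 500)
print(f"z = {z:.2f}")  # far beyond the conventional 1.96 two-sided cutoff
```

With samples of that size, an 18-point swing is well outside what chance alone would produce, which is why differences like these are treated as genuine context effects rather than noise.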
One explanation is that when the general question comes first, the question is in fact viewed as including one's whole life, including marriage. However, when the specific question comes first, the overall happiness question may be interpreted in a more narrow sense, referring to all aspects of life other than marriage. Somewhat similar results have been reported by Schuman and Presser (1981) for general and specific attitude items related to abortion. Here higher levels of support were found for a general question about abortion when the general question came before, rather than after, a series of specific questions about approval of abortion in particular situations, such as in cases of rape or damage to the mother's health. When respondents answer the general question first, it is clear that subsequent specific questions are subsets of the general one; answers to the specific questions may well be different from answers to the general one. They can be interpreted in their specific sense, independent of what has gone before them. The general question, however, may be interpreted as really general, i.e., including all of the specific items that are subsets of the general attitude, or it may be interpreted as covering all the other things not included in the previously asked specific questions. Unfortunately we do not yet know the range of phenomena to which this type of order effect applies.

With regard to reports of behavior or events, order can affect responses not only by creating the context for interpretation of the question but also by providing greater cues or time for retrieval processes to take place. In the Wyoming cattle and calf survey mentioned previously (Vogel, 1974), two question orders were used to estimate the number of calves born on the land the ranchers operated. In one form of the questionnaire the total number of calves born during the year 1974 was asked before a series of detailed questions asking how many of these calves were still on the farm or ranch, how many had been sold or slaughtered, and how many had died. When this order was reversed and the total was derived by adding up the categories, the estimate for total calves born during the year was about 10 percent higher than when the question on total number was asked first. Similar results occur when respondents are asked a detailed set of questions about whether they get income from different specific sources, such as wages and salaries, savings accounts, transfer payments, etc., before they are asked their total income. The detailed questions appear to act as reminders and facilitate the memory search needed to come up with the requested information. Marquis et al. (1972) and Bradburn et al. (1979) have found that lengthening the introduction to questions improves reporting of such things as symptoms, utilization of health care facilities, and reports of alcohol and drug consumption. It seems likely that the longer introductions not only direct respondents' attention toward the information requested and start the search process but also give the respondent more time to do the actual retrieval operations.
Unfortunately for survey practice, many of the recommendations based on research on retrieval processes point in directions that are antithetical to other pressures in conducting surveys. Techniques that improve recall, such as more detailed questions, longer questions, and giving the respondent more time to think about answers, also increase the length of the interview and thus costs. There is also concern on the part of the U.S. Office of Management and Budget, which must grant clearance for surveys conducted under government contract, that respondents not be overly burdened by surveys. Respondent burden has typically been defined in terms of length of interview without consideration for difficulty of the tasks.

The setting within which the interview is conducted may also create a context that facilitates or inhibits accurate reporting. The presence of others is the desired case for the National Health Interview Survey, because they may produce interactions that stimulate memory and produce overall better reporting. For the reporting of some types of behavior, however (sensitive or embarrassing behavior, for instance), the presence of others may serve as an inhibitory factor and reduce reporting. In their review of methodological studies on response effects, Sudman and Bradburn (1974) found a slight overall negative response effect (that is, a net underreporting) for surveys in which another adult was present during the interview. In a fragmentary finding, Bradburn et al. (1979) found that the presence of others did not generally affect level of reporting, but that the presence of children seemed to make respondents uneasy about discussing sensitive behaviors such as drug and alcohol consumption and sexual behavior. The presence of adult third parties seemed to stimulate higher item-refusal rates.

Time and Frequency

In surveys, one of the most important tasks facing the respondent is recalling the correct time period in which the behavior in question occurred. Errors that occur when respondents misremember the time period during which some event occurred are called telescoping errors. While one might imagine that such errors would be symmetrically distributed around the true time period, the errors are more in the direction of remembering an event as having occurred more recently than it did. Thus the net effect of telescoping is to increase the total number of events reported in the more recent time period. This is partly because factors that affect omissions, such as the salience of events, also affect the perception of time (Brown et al., 1983). The more salient (or more frequent) the event, the more recent it appears to be. The result is that there will be overreporting of more salient or more frequent events.

Some of the best studies of telescoping effects on memory have been done by Neter and Waksberg (1964). They developed a procedure, called "bounded recall," for reducing the effects of telescoping. Bounded-recall procedures require panel designs in which respondents are interviewed several times. At the beginning of the second or later interviews (bounded interviews), respondents are reminded of what they said about their behavior during the previous interview and then asked about additional behavior since that time. For example, in the initial interview, respondents may be asked about the number of visits to physicians in the last three months.
At the second interview, three months later, respondents review their responses from the previous interview and then are asked, "Since we last talked, how many times have you seen a physician?" The bounded-recall procedure requires a considerable amount of control over the data in order to provide interviewers with the correct information from the previous interview, and thus it has not been used as often as it deserves to be.

Respondents in surveys are not only asked to report on their behavior or attitudes; they are also asked to make judgments about how often something has happened or how frequently they feel some way or did something. Some research has been done on the accuracy of time reporting, but most attention has been focused on increasing the validity of behavioral reports for the number of events that occur within a particular time period, e.g., how many times did you visit a doctor during the past 90 days? Relatively little attention has been paid to the cognitive problems of reporting frequency of subjective states, e.g., how often during the past few weeks did you feel bored?
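The net effect of forward telescoping on a bounded reference period, described above, can be seen in a small simulation. In the sketch below, events are spread uniformly over the past year and each remembered date is pulled toward the present; the exponential shift and its 15-day mean are illustrative assumptions, not estimates from Neter and Waksberg.

```python
import random

def telescoped_counts(n_events, window_days=90, mean_shift_days=15, seed=1):
    """Compare the true and the reported number of events falling in a
    recent reference period when recall telescopes events forward.
    Event dates are uniform over the past year; each recalled date is
    pulled toward the present by an exponentially distributed shift
    (both distributions are illustrative assumptions)."""
    rng = random.Random(seed)
    true_days_ago = [rng.uniform(0, 365) for _ in range(n_events)]
    recalled_days_ago = [max(0.0, t - rng.expovariate(1.0 / mean_shift_days))
                         for t in true_days_ago]
    true_count = sum(t <= window_days for t in true_days_ago)
    reported = sum(t <= window_days for t in recalled_days_ago)
    return true_count, reported

true_count, reported = telescoped_counts(5_000)
print("events truly in the last 90 days:", true_count)
print("events reported in the window:  ", reported)
# Forward shifts can only pull events into the recent window, never push
# them out, so the recent period is overreported -- the inflation that
# bounded recall removes by anchoring reports at the previous interview.
```

Under these assumptions the overreporting comes entirely from events just outside the reference period being remembered as inside it, which is why bounding the period with the prior interview's reports eliminates it.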

The general consensus among cognitive psychologists is that people are quite good at making relative frequency judgments concerning a variety of events for which frequency information was acquired under incidental learning conditions, that is, under experimental conditions in which subjects are not explicitly asked to attend to the frequency of events. Studies that have addressed the issue of people's abilities in the domain of frequency judgments have examined two types of events: (1) experimentally controlled events that occur from one to ten times, which are here referred to as "experimental-low frequency" events (E-LF); and (2) naturally occurring events that are generally of high frequency (e.g., the letter "a" in spoken English), which are here referred to as "naturalistic-high frequency" events (N-HF).

For E-LF events, judged frequency increases with actual observed frequency (Hasher and Chromiak, 1977; Hintzman, 1969; Hintzman and Stern, 1978; Johnson et al., 1977; Warren and Mitchell, 1980). Most of these studies used printed or spoken English words as stimuli, although pictures have also been used and give similar results. The same basic result is found with N-HF events. It has been argued that the encoding and updating of information concerning the occurrence of events is obligatory and automatic (e.g., Hasher and Zacks, 1979). Some evidence for this claim is provided by the fact that there is a high positive correlation between people's judgments and the actual occurrence of a variety of naturally occurring events: letter frequency (Attneave, 1953), words (Shapiro, 1969), and lethal events (Lichtenstein et al., 1978). More direct experimental evidence on this topic comes from studies that have compared the reporting of frequency information acquired under incidental and explicit learning conditions (Howell, 1973; Attig and Hasher, 1980; Kausler et al., 1981).
In general these studies show no or only slight effects of explicit instructions to attend to frequency. The automatic nature of frequency encoding also apparently extends to the activation of higher-order information. Alba et al. (1980) presented subjects with lists of words that were chosen so as to represent 3, 6, or 9 instances of various semantic categories. They found that subjects were able to judge categorical frequencies in a surprise posttest: judgments of category frequency increased as actual frequency increased, and mean judgments for items occurring with different frequencies were significantly different from each other.

It has further been argued that the ability to make frequency judgments does not involve a learned component. Developmental studies that compare the performance of children as young as kindergarten age with college-age adults generally do not find evidence for an age-by-frequency-of-presentation interaction (Hasher and Chromiak, 1977; Hasher and Zacks, 1979). This seems to indicate that the frequency-of-repetition variable affects frequency judgments during childhood and adulthood in similar ways. Age differences in the absolute frequency of responses have been reported (main effects only), but these are not monotonic with age (Hasher and Chromiak, 1977). There is a limited amount of evidence that the ability to make frequency judgments declines with age. In a comparison of older adults (60-75 years of age) with young adults (18-30 years of age), Warren and Mitchell (1980) found an age-by-frequency-of-presentation interaction for absolute frequency judgments. This interaction was due to a slight loss of discrimination among items of differing frequency for the older group. Age interactions, however, have not been found when the task has been one of choosing the more frequent of two items (Attig and Hasher, 1980; Kausler et al., 1981). In addition, practice effects have not been found for subjects making repeated frequency judgments (Erlich, 1964; Hasher and Chromiak, 1977). This also suggests that learning does not play an important role in frequency judgments.

Limits on the Ability to Make Frequency Judgments

A number of studies have produced results showing that some restrictions exist on people's ability to make frequency judgments. In the first place, it is generally known that people are not very accurate in their absolute frequency judgments. The typical result is that for E-LF events people tend to overestimate low frequencies and underestimate high frequencies (Hasher and Chromiak, 1977; Hintzman, 1969). There is some evidence that whether an event occurring with a particular frequency will be overestimated or underestimated depends on the frequency with which other events under consideration have occurred. That is, subjects seem to use some knowledge of the average frequency of events in the experiment. For example, Hintzman (1969) reports that in one experiment in which items were presented with frequencies ranging from one to ten times, subjects overestimated the frequency of items that occurred twice. In another experiment, however, where the highest frequency of occurrence was two, subjects tended to underestimate the frequency of items occurring twice. The overestimation of low-frequency events and the underestimation of high-frequency events is also found with N-HF events (Attneave, 1953; Lichtenstein et al., 1978; Shapiro, 1969).
Attneave (1953) asked subjects to judge the frequency of letters occurring in written language and gave them a standard of comparison of the occurrence of the average letter per 1,000 letters. Lichtenstein et al. (1978) provided their subjects with a standard of comparison of 50,000 deaths per year due to motor vehicle accidents or of 1,000 deaths per year due to electrocution. Shapiro (1969), however, did not provide his subjects with anchor points. It is not clear what mechanism is responsible for the over- and underestimation. One possibility is that on some percentage of trials subjects do not have adequate frequency information and that on those trials they adopt a strategy of using the mean frequency. This could account for the opposite biases at the two ends of the scale, provided that subjects are aware of an average frequency. Subjects also tend to be more accurate at estimating low-frequency items than at estimating high-frequency items (Alba et al., 1980; Hasher and Chromiak, 1977), and the variability of responses tends to be much higher for higher-frequency items.

The absolute size and accuracy of frequency judgments have been found to be affected by the spacing of repeated occurrences of an event (i.e., to be a function of the number of intervening events between repetitions of an event). The general finding is that for E-LF events increasing the interval between successive presentations leads to higher and more accurate frequency judgments (Hintzman, 1969; Jacoby, 1972; Rose, 1980; Underwood, 1969). The context in which an event is presented also affects the absolute size of frequency judgments but not the discriminability of events occurring with different frequencies. The general finding is that for items of equal frequency, if repeated items are placed in a different context on each occurrence, then the judged frequency will be lower than if the context is the same on each occasion. For example, Hintzman and Stern (1978) presented names of famous people to be learned in the context of a descriptive statement about the person that was either the same or different on all occasions and had subjects make a truth-value judgment about the statements; they found higher reported frequency for low-variability contexts. Jacoby (1972) and Rose (1980) report a similar finding. However, Rose (1980) finds equal discriminability for items of differing frequency under the two context conditions.

It is also possible to affect the absolute size of one's frequency judgment by generating the item silently or explicitly by writing or speaking it (Johnson et al., 1977; Johnson et al., 1979a, 1979b). The basic design of these studies included targets presented a variable number of times in the context of the same paired associate. Following a learning trial, some of the targets were tested by presenting the paired-associate cue and having subjects write down the target. (Most subjects could do this on most trials.) The basic result is that test frequency (i.e., occasions for self-generated occurrence of an event) affects judgments of external event frequency and vice versa. This has been replicated for two other college-age samples by Johnson and associates and extended to 8- to 12-year-old children (Johnson et al., 1977; Johnson et al., 1979a, 1979b).
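One candidate mechanism mentioned above, that on trials lacking usable frequency information respondents substitute the mean frequency of events in the experiment, is easy to simulate. The sketch below is purely illustrative: the substitution probability and the frequency values are assumptions of mine, not estimates from any of the studies cited.

```python
import random

def simulate_judgments(true_freqs, p_no_info=0.3, trials=10000, seed=1):
    """Mean judged frequency for each true frequency, under the
    (assumed) mean-substitution account: with probability p_no_info a
    respondent has no trace information and reports the grand mean."""
    rng = random.Random(seed)
    grand_mean = sum(true_freqs) / len(true_freqs)
    judged = {}
    for f in true_freqs:
        total = 0.0
        for _ in range(trials):
            total += grand_mean if rng.random() < p_no_info else f
        judged[f] = total / trials
    return judged

for f, j in sorted(simulate_judgments([1, 2, 5, 8, 10]).items()):
    print(f"true frequency {f:2d} -> mean judgment {j:.2f}")
```

Under this toy model, low frequencies come out overestimated and high frequencies underestimated, with near-accuracy in the middle, reproducing the qualitative pattern reported for E-LF events without any trace-level forgetting.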
Lichtenstein et al. (1978) tried to determine the cause of the systematic overestimation of the frequency of occurrence of some lethal events. They found two variables that account for most of the variability in their subjects' responses: (1) the frequency with which they report learning about the event as a cause of suffering through the media and (2) the frequency with which they report having had the death of a close friend or relative caused by the event. Both the Johnson et al. studies (1977, 1979a, 1979b) and the Lichtenstein et al. (1978) results may be explainable by the availability heuristic of Tversky and Kahneman (1973). This states that frequency judgments may under certain circumstances (see below) be based on how easily instances of the event become available at the time of making the frequency judgment. To the extent that thinking or hearing about an event or having some personal involvement with an event increases its salience, such conditions may make information more available and therefore make the event appear more frequent.

A study by Rowe (1974) suggests that the type or degree of processing given to targets in an incidental learning condition may also affect the absolute value of frequency judgments. Rowe had subjects either process items semantically (e.g., rate each target on how strongly it connotes strength) or nonsemantically (e.g., determine the number of syllables in a target) and found that semantic processing resulted in higher frequency responses. It is tempting to speculate that this effect may also be due to semantically processed items being made more salient to the subject and therefore, by the logic of the Tversky and Kahneman (1973) argument, being judged as occurring more frequently than less salient items.

A number of the factors affecting frequency judgments may lead, under some circumstances, to an incorrect assessment of the occurrence of behavior by the survey researcher. Age differences in the absolute size of response (e.g., Hasher and Chromiak, 1977; Warren and Mitchell, 1980), whether they are attributable to response biases (Hasher and Chromiak, 1977) or to differences in the discrimination of frequencies (Warren and Mitchell, 1980), must be considered by the researcher investigating a behavior over a wide age span. The systematic under- and overestimation of events occurring with high and low frequency, respectively (e.g., Hintzman, 1969), can also be a problem if one is interested in obtaining data on the absolute rather than the relative frequency of occurrence of events. This will be true to the extent that events of different frequencies are compared, as, for example, in comparisons between physician visits reported by older and younger respondents who have very different frequencies of doctor visits, or for visits to dentists as compared with visits to psychiatrists.

It is not clear whether a remedy exists for this systematic distortion of frequency judgments. The fact that this distortion is found under natural learning conditions when no reference point is given (Shapiro, 1969) suggests that it may be the result of the very process by which people make such judgments. It is interesting to note that an analogous distortion is found in the domain of time dating: Brown et al. (1983) found that subjects asked to place events on a bounded time scale tend to bring forward in time older events and move back in time more recent events, a phenomenon they have dubbed the squish effect. They have further found that this effect can be reduced, but not eliminated, by extending the boundary at the far end of the scale (i.e., the earlier time boundary). Brown et al. (1983) also show that it is difficult for subjects to determine when salient, public events have taken place.

Loftus and Marburger (1983) have undertaken some research to attempt to ameliorate this serious problem. They find that simply asking subjects to search memory for events occurring between two specific dates provides more accurate responding than using a general dating procedure such as "in the last six months . . . ." A more important contribution of the Loftus and Marburger (1983) paper is their use of landmarks, public or private, to mark the boundaries of the recall period. They found that the use of a public event, such as the eruption of Mt. St. Helens or New Year's Day, or a private event, such as the respondent's birthday, to mark the beginning of the recall period resulted in more accurate frequency reports than even the use of specific dates. It appears that landmark events are more useful than calendar dates in allowing respondents to make contact with other events in their own lives. More work is needed to determine what exactly can be considered a landmark and how to circumvent certain inherent problems with the use of landmarks: generality for all groups and applicability of a landmark for the length of the survey (for large surveys interviewing may continue for weeks or months).

Two general models of representation have been presented as the basis for frequency judgments. The first, the strength model, maintains that subjects judge the frequency with which an event occurred by consulting a unidimensional trace of the event and determining the "strength" of the trace, rather like the vividness of an impression. All other factors being equal, a more frequent event will be associated with a stronger trace than a less frequent event. By this account, information concerning individual instances (e.g., nuances of context) is not part of the memory representation. The alternative model, multitrace theory, has been proposed in various forms (e.g., Anderson and Bower, 1972; Hintzman and Block, 1971). The different versions share the position that each occurrence of the same nominal event is represented separately in memory. The form of the representation may be in terms of list markers indicating individual occurrences attached to the permanent address of the event in memory (Anderson and Bower, 1972) or as separate episodes with some index to the effect that they are all instances of the same nominal event (Hintzman and Block, 1971). The current position among experimental psychologists is that strength theory cannot adequately explain the relevant experimental results (Crowder, 1976). Perhaps the best evidence against strength theory is provided by two experiments by Hintzman and Block (1971, Experiments II and III). Results similar to those of Hintzman and Block (1971) have been obtained by Anderson and Bower (1972) for subjects also judging local frequencies of events.
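The contrast between the two accounts can be caricatured as two tiny data structures. This is an expository sketch only, since neither theory is specified at this level of detail, but it makes the key empirical difference concrete: both representations support a frequency judgment, yet only the multitrace store retains information that distinguishes individual occurrences.

```python
from collections import defaultdict

class StrengthMemory:
    """Strength model: one unidimensional trace per nominal event.
    Contextual detail of individual occurrences is not stored."""
    def __init__(self):
        self.strength = defaultdict(float)

    def encode(self, event, context=None):
        self.strength[event] += 1.0      # context is discarded

    def judged_frequency(self, event):
        return self.strength[event]

class MultitraceMemory:
    """Multitrace model: each occurrence stored as a separate episode,
    indexed by the nominal event (cf. Hintzman and Block, 1971)."""
    def __init__(self):
        self.traces = defaultdict(list)

    def encode(self, event, context=None):
        self.traces[event].append(context)   # context survives

    def judged_frequency(self, event):
        return len(self.traces[event])

    def contexts(self, event):
        return self.traces[event]

s, m = StrengthMemory(), MultitraceMemory()
for ctx in ["list position 3", "list position 40", "list position 97"]:
    s.encode("dog", context=ctx)
    m.encode("dog", context=ctx)

# Both stores yield a frequency judgment of 3, but only the multitrace
# store can report which contexts the word appeared in.
print(s.judged_frequency("dog"), m.judged_frequency("dog"))
print(m.contexts("dog"))
```

The experiments cited against strength theory turn on exactly this difference: subjects can report occurrence-specific detail that a single scalar trace could not carry.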
Memory for details corresponding to the unique aspects of nominally identical events is also contrary to the predictions of the strength model. Clearly, people are able to retrieve information that differentiates among various instances of an event (e.g., Hintzman and Block, 1971; Linton, 1982). The distinction between semantic and episodic memory (e.g., Tulving and Thomson, 1973) is in part a reflection of this fact.

Two general strategies have been described in the literature as methods of accessing memory traces for frequency judgments: (1) counting of individual instances and (2) use of the availability heuristic. The first requires accessing individual traces by some means (discussed below), determining whether the instance is of the type being searched for, and keeping an accurate count of these instances. If subjects do use this strategy, data on the accuracy of absolute frequency counts attest to the fact that it is an error-prone process. Some of the possible error sites are: the process of matching a retrieved instance to a description of the required one, the tagging of already-counted instances to avoid double consideration, and the updating of the counter. The second strategy, the availability heuristic (Tversky and Kahneman, 1973:208), is described as making ". . . estimates [of] frequency or probability by the ease with which instances or associations could be brought to mind." This method does not require actual operations of retrieval, only a judgment about the ease with which particular instances could be processed. Tversky and Kahneman (1973, Experiments I-IV) find high, positive correlations between subjects' judgments about the number of instances of various categories they would be able to recall and the number actually recalled by another group of subjects. These authors think that availability can be used to make frequency judgments, especially when individual instances are not distinct, are too numerous to permit counting, or occur at a rate that is too fast for accurate encoding. They caution, however, that availability can be affected by factors such as recency of occurrence and salience, which will result in imprecise judgments of frequency. Several of the experimental findings that have been reviewed here may have resulted from the use of the availability heuristic, for example, Lichtenstein et al.'s (1978) finding of overestimates due to salience of the event to the respondent and all of the other judgments of naturalistic events (e.g., Attneave, 1953; Erlich, 1964; Shapiro, 1969).

The research on frequency judgment can make important contributions to research on errors in surveys, since many surveys ask respondents to report on the frequency of some past behavior. We need to know more about the conditions in survey reporting under which respondents use the two strategies for making frequency judgments and more about what might affect their use of the availability heuristic. Given that people appear to be better at making relative than absolute judgments of frequency, we need to know more about the limits of accuracy in making absolute judgments and the types of reporting where it would be better to be content with relative judgments than to try to get absolute, but inaccurate, reports of frequencies.
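The counting strategy and its three error sites can likewise be sketched as a probabilistic process. All of the error rates below are arbitrary placeholders chosen for illustration; the point is only where error can enter the count, not how large each error is.

```python
import random

def count_instances(traces, target, retrievals=500, p_match_err=0.1,
                    p_tag_err=0.05, p_update_err=0.05, seed=0):
    """Error-prone counting of memory traces matching `target`.
    Errors can enter at the three sites named in the text:
      matching  - a retrieved trace is misclassified,
      tagging   - an already-counted trace is counted again,
      updating  - the running counter fails to increment."""
    rng = random.Random(seed)
    counted = set()
    count = 0
    for _ in range(retrievals):
        i = rng.randrange(len(traces))        # retrieval may revisit a trace
        is_target = traces[i] == target
        if rng.random() < p_match_err:        # matching error
            is_target = not is_target
        if not is_target:
            continue
        if i in counted:
            if rng.random() < p_tag_err:      # tagging error: double-count
                count += 1
            continue
        counted.add(i)
        if rng.random() >= p_update_err:      # updating error: lost increment
            count += 1
    return count

traces = ["doctor visit"] * 6 + ["dentist visit"] * 4
print(count_instances(traces, "doctor visit"))  # compare with the true count of 6
```

With all three error rates set to zero (and enough retrievals) the function recovers the true count; any nonzero rate turns the reported frequency into a noisy function of the retrieval process, which is consistent with the observed inaccuracy of absolute frequency judgments.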
Conclusion

In this paper we have reviewed the kinds of problems that are of concern to those working in the field of response effects in surveys and related them to some of the pertinent theories and findings of cognitive research. A number of the factors shown to have effects in the experimental literature may not have analogous results outside of the laboratory. In the case of a factor such as the similarity of contexts at the time of encoding, the relevant data on what constitutes similarity are not available. Similarly, the effect of spacing on frequency reports may be eliminated when the interval between repetitions is measured in terms of days rather than seconds and when the number of intervening events may be in the hundreds rather than less than ten. The same general critique may be applied to the factors of type of processing given to an event (e.g., Rowe, 1974) and of thinking about an event (e.g., Johnson et al., 1977). The argument here is not that these experimentally identified factors do not have implications for the design of surveys, but rather that the application to events outside of the laboratory is not direct and requires additional research.

The purpose of the paper is to structure a discussion among survey researchers, cognitive scientists, and statisticians to explore the ways in which work in the different fields can be brought together to enrich our understanding of errors in information processing, be they in the survey context, in other contexts of interest to researchers, or in ordinary discourse. From this discussion we hope that an agenda for future research will emerge.

References

Alba, J.W., Chromiak, W., Hasher, L., and Attig, M.S. 1980 Automatic encoding of category size information. Journal of Experimental Psychology: Human Learning and Memory 6:370-378.
Anderson, J.R., and Bower, G.H. 1972 Recognition and recall processes in free recall. Psychological Review 79(2):97-123.
Attig, M., and Hasher, L. 1980 The processing of occurrence information by adults. Journal of Gerontology 35:66-69.
Attneave, F. 1953 Psychological probability as a function of experienced frequency. Journal of Experimental Psychology 46:81-86.
Bartlett, F.C. 1932 Remembering: A Study in Experimental and Social Psychology. Cambridge, England: Cambridge University Press.
Belson, W.A. 1968 Respondent understanding of survey questions. Polls 3.
Bradburn, N.M. 1983 Response effects. In P.H. Rossi and J.D. Wright, eds., The Handbook of Survey Research. New York: Academic Press.
Bradburn, N.M., Sudman, S., and associates 1979 Improving Interview Method and Questionnaire Design. San Francisco: Jossey-Bass.
Bransford, J.D., and Johnson, M. 1972 Contextual prerequisites for understanding: some investigations of comprehension and recall. Journal of Verbal Learning and Verbal Behavior 11:717-726.
Brown, N., Rips, L.J., and Shevell, S.K. 1983 Temporal Judgments About Public Events. Unpublished manuscript, University of Chicago.
Burke, D.M., and Light, L.L. 1981 Memory and aging: the role of retrieval processes. Psychological Bulletin 90(2):513-546.
Crowder, R.G. 1976 Principles of Learning and Memory. Hillsdale, N.J.: Erlbaum.
Erlich, D.E. 1964 Absolute judgments of discrete quantities randomly distributed over time. Journal of Experimental Psychology 67:475-482.

Fee, J. 1979 Symbols and Attitudes: How People Think About Politics. Unpublished doctoral dissertation, University of Chicago.
Hasher, L., and Chromiak, W. 1977 The processing of frequency information. Journal of Verbal Learning and Verbal Behavior 16:173-184.
Hasher, L., and Zacks, R.T. 1979 Automatic and effortful processes in memory. Journal of Experimental Psychology: General 108:356-388.
Hintzman, D.L. 1969 Apparent frequency as a function of frequency and spacing of repetitions. Journal of Experimental Psychology 80:139-145.
Hintzman, D.L., and Block, R.A. 1971 Repetition and memory: evidence for a multiple-trace hypothesis. Journal of Experimental Psychology 88(3):297-306.
Hintzman, D.L., and Stern, L.D. 1978 Contextual variability and memory for frequency. Journal of Experimental Psychology: Human Learning and Memory 4:539-549.
Howell, W.C. 1973 Storage of event frequencies: a comparison of two paradigms in memory. Journal of Experimental Psychology 98:260-263.
Jacoby, L.L. 1972 Context effects on frequency judgments of words and sentences. Journal of Experimental Psychology 94:255-260.
Johnson, M.K., Taylor, T.H., and Raye, C.L. 1977 Fact and fantasy: the effects of internally generated events on the apparent frequency of externally generated events. Memory & Cognition 5:116-122.
Johnson, M.K., Raye, C.L., Hasher, L., and Chromiak, W. 1979a Are there developmental differences in reality monitoring? Journal of Experimental Child Psychology 27:120-128.
Johnson, M.K., Raye, C.L., Wang, A.Y., and Taylor, T.H. 1979b Fact and fantasy: the roles of accuracy and variability in confusing imaginations with perceptual experiences. Journal of Experimental Psychology: Human Learning and Memory 5:229-240.
Kausler, D.H., Wright, R.E., and Hakami, M.K. 1981 Variation in task complexity and adult age differences in frequency of occurrence judgments. Bulletin of the Psychonomic Society 18:195-197.
Kintsch, W., and van Dijk, T.A. 1978 Toward a model of text comprehension and production. Psychological Review 85:363-394.
Lichtenstein, S., Slovic, P., Fischhoff, B., Layman, M., and Combs, B. 1978 Judged frequency of lethal events. Journal of Experimental Psychology: Human Learning and Memory 4:551-578.
Linton, M. 1982 Transformations of memory in everyday life. In U. Neisser, ed., Memory Observed. San Francisco: W.H. Freeman and Co.

Loftus, E.F. 1977 Shifting color memory. Memory & Cognition 5:696-699.
Loftus, E.F. 1982 Interrogating eyewitnesses--good questions and bad. Chapter 4 in R. Hogarth, ed., Question Framing and Response Consistency, Vol. 4. San Francisco: Jossey-Bass.
Loftus, E.F., and Marburger, W. 1983 Since the eruption of Mt. St. Helens, did anyone beat you up? Improving the accuracy of retrospective reports with landmark events. Memory & Cognition 11:114-120.
Loftus, E.F., and Palmer, J.C. 1974 Reconstruction of automobile destruction: an example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behavior 13:585-589.
Marquis, K.H., Cannell, C.F., and Laurent, A. 1972 Reporting of health events in household interviews: effects of reinforcement, question length and reinterviews. Vital and Health Statistics. National Center for Health Statistics, Pub. 1000, Series 2, No. 45. Washington, D.C.: U.S. Government Printing Office.
Murray, J.R., Minor, M.J., Cotterman, R.F., and Bradburn, N.M. 1974 Household. Report No. 126. Chicago: National Opinion Research Center.
Neter, J., and Waksberg, J. 1964 A study of response errors in expenditures data from household surveys. Journal of the American Statistical Association 59:18-55.
Noelle-Neumann, E. 1970 Wanted, rules for wording structured questionnaires. Public Opinion Quarterly 34:191-201.
Orne, M.T. 1969 Demand characteristics and the concept of quasi-controls. Pp. 143-179 in R. Rosenthal and R.L. Rosnow, eds., Artifact in Behavioral Research. New York: Academic Press.
Payne, S.L. 1951 The Art of Asking Questions. Princeton: Princeton University Press.
Pichert, J.W., and Anderson, R.C. 1977 Taking different perspectives on a story. Journal of Educational Psychology 69:309-315.
Rose, R.J. 1980 Encoding variability, levels of processing, and the effects of spacing of repetitions upon judgments of frequency. Memory & Cognition 8:84-93.
Rowe, E.J. 1974 Depth of processing in a frequency judgment task. Journal of Verbal Learning and Verbal Behavior 13:638-643.
Rugg, D. 1941 Experiments in wording questions, II. Public Opinion Quarterly 5:91-92.

Schuman, H., and Presser, S. 1981 Questions and Answers in Attitude Surveys. New York: Academic Press.
Shapiro, B.J. 1969 The subjective estimation of relative word frequency. Journal of Verbal Learning and Verbal Behavior 8:248-251.
Smith, T. 1980 Technical Report No. 21. Chicago: National Opinion Research Center.
Sudman, S., and Bradburn, N.M. 1974 Response Effects in Surveys. Chicago: Aldine.
Tulving, E. 1972 Episodic and semantic memory. In E. Tulving and W. Donaldson, eds., Organization of Memory. New York: Academic Press.
Tulving, E., and Thomson, D.M. 1973 Encoding specificity and retrieval processes in episodic memory. Psychological Review 80(5):352-373.
Tulving, E., and Wiseman, S. 1976 Encoding specificity: relation between recall superiority and recognition failure. 2:349-361.
Tversky, A., and Kahneman, D. 1973 Availability: a heuristic for judging frequency and probability. Cognitive Psychology 5:207-232.
Underwood, B.J. 1969 Some correlates of item repetition in free recall learning. Journal of Verbal Learning and Verbal Behavior 8:83-94.
Vogel, F.A. 1974 How Questionnaire Design May Affect Survey Data: Wyoming Study. Statistical Reporting Service, U.S. Department of Agriculture, Washington, D.C., Mimeo.
Warren, L.R., and Mitchell, S.A. 1980 Age differences in judging the frequency of events. Bulletin of the Psychonomic Society 16:116-120.

RECORD CHECKS FOR SAMPLE SURVEYS

Kent Marquis

We need to include validity checks in research on survey cognitive processes for at least two reasons: (1) to help define the measurement problems needing explanation and (2) to know when proposed solutions have improved the answers that respondents give. Record checks are one kind of validity check. They are a form of "criterion validity" in which survey interview responses are compared with criterion or "true" values on a case-by-case basis. But the appropriate design of record checks is not so straightforward as the definition might imply. If the design is incomplete, as often happens in the field of health survey methods, we risk drawing the wrong conclusions about the nature of reporting problems, and we risk implementing measurement solutions that may actually do more harm than good.

In this paper I first present the generic record-check designs and their response bias estimators. Then, in the second section, I work through some numerical examples to illustrate the design and interpretation problems of incomplete record checks. From this we will learn that the full record-check design is robust to several important kinds of design and process mistakes that can occur. In the third section I review the empirical results of health survey record checks on the reporting of hospital stays and physician visits. These results tend to confirm the existence of the design-induced interpretation problems that were illustrated in the previous section. These studies also suggest that the net response biases for survey estimates of health service use are close to zero. Some implications of the zero net survey bias are mentioned in the last section.

Basic Logic of Record Checks

In this section we review the underlying logic of record-check designs and the inferences about survey response bias.
For didactic purposes I restrict our attention to the binary variable case: we will learn about three "pure" record-check design possibilities and the bias estimators that go with each of the designs. Table 1 shows the cross-classified record-check outcome possibilities for a dichotomous variable such as whether a person was hospitalized, visited a physician, or has a particular chronic health condition. The record outcomes (yes, no) are arrayed along the top; the survey outcomes are arrayed along the left side. The cells contain the cross-classified values reflecting both the survey and record observation outcome. On the agreement diagonal, the A cell contains the observations for which both the record and the survey indicate "yes," while the D cell contains agreements about the absence of the characteristic. The off-diagonal cells, B and C, contain the disagreements or errors: it is the misestimation of these cells (e.g., their frequency or probability) that leads to the common misinterpretation problems.

TABLE 1 Basic Record-Check Matrix for Binary Variables With No Missing Data

            Record
Survey      Yes       No        Total
Yes         A         B         A + B
No          C         D         C + D
Total       A + C     B + D     A + B + C + D

Net survey bias is the discrepancy between the survey estimate of the population characteristic and the true value of the characteristic. In the framework used here, the survey estimate is (A + B)/(A + B + C + D), and the true value is (A + C)/(A + B + C + D). This discussion deliberately omits the possibilities of nonresponse, processing, and sampling biases, so we may interpret the discrepancy as the response bias.

There are three different ways of conducting a record check, which I call the AC design, the AB design, and the full design. Collectively I refer to the AC and AB designs as partial or incomplete designs. In the pure AC design, which has been used widely in health survey studies, we first go to the records to select cases with the characteristic of interest present. This might be a sample of people with hospital admissions, people with doctor visits, or people with a chronic condition such as one of the heart diseases. We then interview these people to see if they report the characteristic of interest. Our estimate of survey bias is C/(A + C), the underreporting rate. Note that we ignore the B errors.

The pure AB design is a different approach. Here we conduct the survey first and then check records for people who report the presence of the characteristic of interest. If Ms. Smith reports a visit to Dr. Brown, for example, we write to Dr. Brown and ask him to confirm or deny the validity of Ms. Smith's survey report. We estimate the survey bias as B/(A + B), which we call the overreporting rate. Note the absence of C-cell information.

We can arrange full designs in several ways.
In health survey studies the most common approach is to identify a population of interest, sample from it independently of record or survey values of the characteristic(s) of interest, obtain both survey and record information for each sampled element, and compare the two information sources. The principal feature of the full design is its ability to obtain unbiased estimates of at least cells A, B, and C. The full design estimate of survey response bias is (B - C)/(A + B + C + D).

Illustrations of Record-Check Inferences About Net Survey Response Bias

I use numerical examples next to illustrate the interpretive pitfalls inherent in estimates that come from incomplete record-check designs. In an earlier paper (Marquis, 1978a) I provide formal derivations of the conclusions that the AB and AC designs (1) can overestimate the size of the response bias and (2) can misestimate the direction of the bias (e.g., inferring forgetting when the predominant errors are false positive responses).

Using whether hospitalized as the variable of interest and referring to Table 1, we get entries on the agreement diagonal, cells A and D, when the survey and record are either both yes (both report a hospitalization) or both no (both deny a hospitalization).1 For convenience, let's assume that the record is always correct and that each person in the population either was or was not hospitalized and reports either yes or no. Then the true rate of persons hospitalized is (A + C)/(A + B + C + D), and the survey estimate is (A + B)/(A + B + C + D). The net survey or response bias is the difference between the survey estimate and the true rate.

Illustration 1: Bias Estimates From Partial Designs Are Too Large Because the Estimators Use the Wrong Denominator

One of two problems in the partial design bias estimators is that they use a denominator that excludes some relevant information.2 This results in an estimate of the response bias that is too large. For example, assume we have a population of 100 people and that records show that 60 of them stayed in the hospital at least once within the last 12 months. If interviewed, assume 10 of these 60 people would fail to report being hospitalized. Also, if we interviewed the 40 people who were not hospitalized, assume they would all report this fact correctly.
The cross-classified observations for this example are in Table 2.

1. The derivations in Marquis (1978a) address the effects of record errors on the estimates of survey bias.
2. The partial design estimators are conditional error rates. They are "wrong" only in the sense that they are often misinterpreted as estimates of net bias. The other problem with a partial design estimate is with the numerator, and this is discussed later.
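The net bias and the three design estimators defined so far are simple enough to express as one-line functions, which makes the illustrations in this section easy to check. This is my own sketch, not code from the paper; the cell labels follow Table 1 and the function names are mine.

```python
def net_bias(a, b, c, d):
    """Net survey response bias: survey estimate minus true rate."""
    n = a + b + c + d
    return (a + b) / n - (a + c) / n

def full_design(a, b, c, d):
    """Full design estimate: (B - C) / (A + B + C + D)."""
    return (b - c) / (a + b + c + d)

def ac_design(a, c):
    """Pure AC design estimate: C / (A + C), the underreporting rate."""
    return c / (a + c)

def ab_design(a, b):
    """Pure AB design estimate: B / (A + B), the overreporting rate."""
    return b / (a + b)

# Illustration 1's assumptions: 60 hospitalized (10 would not report it),
# 40 not hospitalized (all would report correctly).
a, b, c, d = 50, 0, 10, 40
print(round(net_bias(a, b, c, d), 2))   # prints -0.1 (net bias)
print(round(ac_design(a, c), 2))        # prints 0.17 (underreporting rate)
```

The AC rate of .17 exceeds the .10 net bias in magnitude, which is exactly the denominator problem Illustration 1 describes.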

TABLE 2 Hypothetical Data Illustrating the Denominator Problem

            Truth or Perfect Record
Survey      Yes      No       Total
Yes         50       0        50
No          10       40       50
Total       60       40       100

The true survey bias is

    [(A + B) - (A + C)] / (A + B + C + D) = (50 - 60)/100 = -.10 ;

the AC design estimate is

    -C / (A + C) = -10/60 = -.17 .

The correct denominator for an AC design estimate is A + B + C + D and, if used in this example, provides the desired estimate of the survey bias, -10/100 = -.10.

Illustration 2: Estimates From Partial Designs Can Yield the Wrong Sign for the Survey Bias

Because an incomplete design makes it impossible to observe all the survey errors, the bias estimate could have the wrong sign. For example, assume we have a population of 100 people, 60 of whom stayed in the hospital at least once within the past 12 months. If we interview the hospitalized people, assume 55 of them would report this correctly. If we interview the 40 people who weren't hospitalized, assume 10 would report that they were hospitalized (possibly because they "telescoped" hospital stays occurring more than 12 months ago into the 12-month reference period). The cross-classified observations for this example are in Table 3.

TABLE 3 Hypothetical Data Illustrating the Misestimation of the Sign of the Bias

            Truth or Perfect Record
Survey      Yes      No       Total
Yes         55       10       65
No          5        30       35
Total       60       40       100

The true survey bias is

    (65 - 60)/100 = .05 ;

the AC design estimate is -5/60 = -.08; the AC design estimate with the correct denominator is -5/100 = -.05. The AC design estimates, -.08 and -.05, have the wrong sign. Although the true net survey bias is positive, the AC design provides estimates that suggest that the bias is negative. The reader may wish to create another example to show that the AB design will provide a positive response bias estimate when the true survey bias is negative [viz., (A + B) < (A + C)]. The results will be clearer if you make the B value greater than zero.

Illustration 3: Partial Design Estimates Overstate the Size of the Net Survey Bias in the Presence of Match Errors

Let us now introduce match errors, that is, mistakes in cross-classifying the survey and record values. Match errors occur either (1) because one of the sources gives incorrect information about the details of an event that precludes a correct match or (2) because someone makes a mistake in carrying out the matching procedures. For this illustration we assume that all survey and record information is matched, that is, there are no "left over" interviews or records when we finish the matching operation. With this assumption, each time we make a match error, we cause a "compensating" match error to occur somewhere else. The illustration to follow shows that, under the "no nonmatch" assumption, the full design estimate of net response bias is unaffected by match errors (see Neter et al., 1965, for the proof), but the partial design estimates are biased when match errors are present.

135 It is common practice in record checks to require that the survey and record information agree (within tolerance limits) on a number of dimensions other than the characteristic of interest before an entry is made on the agreement diagonal for the characteristic of interest. In our hospital stay record check, for example, procedures might require agreement on the patient's name, address, sex, the name of the hospital, and possibly the date or length of stay within some range. An attribute reporting error is a response mistake about one of these match variables that prevents a (correct) cross-classification of the survey and record information. For example, suppose Mrs. Smith mistakenly reports that her son Tommy was in the hospital to have his tonsils removed when it was really her daughter Suzy who had that experience. This one mistake about the name attribute would generate two classification errors: Tommy's event would show up as a count in the B cell, and Suzy's event would show up as a count in the C cell. Note that Mrs. Smith correctly reported the number of hospital stays for her family, but our special match requirements for the record check cause the report to appear as two offsetting errors. Similarly, a careless clerk could mismatch Tommy's and Suzy's correct interview reports to the hospital information, generating entries in cells B and C instead of in the A and D cells where they belong.

To show the effects of offsetting match errors, let's interview all or part of another population of 100 people and insert only offsetting match errors into the cross-classified observations, as in Table 4, in which 20 match errors have generated 40 off-diagonal entries.

TABLE 4 Hypothetical Data Illustrating Match Errors

Survey        Truth or Perfect Record        Total
              Yes          No
Yes           50           20                 70
No            20           10                 30
Total         70           30                100

Both the true bias and the full design bias estimate are zero. The adjusted (using the correct denominator) AB design estimate is .20; the adjusted AC design estimate is -.20. Both the adjusted AB and AC approaches overestimate the (absolute value of the) survey bias by including half of the unsystematic (offsetting) errors in their estimate of the systematic survey bias.
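The Table 4 arithmetic can be sketched the same way. In this illustrative snippet, the 20 offsetting match errors move 20 cases off each side of the diagonal; the full design estimate cancels them, while each partial estimate keeps half of them.

```python
# Table 4: truth is 70 yes / 30 no and reporting is perfect, but 20
# offsetting match errors shift 20 cases into B and 20 into C.
A, B, C, D = 50, 20, 20, 10
N = A + B + C + D

full_design = (B - C) / N      # 0.0: the offsetting errors cancel
ab_adjusted = B / N            # +0.20: keeps half the match-error entries
ac_adjusted = -C / N           # -0.20: keeps the other half

print(full_design, ab_adjusted, ac_adjusted)
```

Note that the two partial estimates are mirror images: the same unsystematic errors are read as "telescoping" by the AB design and as "forgetting" by the AC design.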

136 To generalize a little further, any random (offsetting) mistakes made by respondents, interviewers, and processing clerks that do not change the (expected value of the) subject-matter estimate from the survey cause the AB design to infer a systematic positive response bias and cause the AC design to infer a systematic negative (i.e., forgetting) bias. Since random mistakes probably are inevitable, the continued use of incomplete record-check designs will continue to mislead us about the direction and size of net survey response biases.

Illustration 4: The Combined Effects

As a final example, I have generated a matrix reflecting the various kinds of systematic and unsystematic errors discussed above to show how full and partial record-check design estimators handle them. Table 5 is based on the following assumptions: a population of 100 people; 60 people were hospitalized; 10 of the 60 would fail to report the hospitalization (i.e., forgetting) if interviewed; 40 people were not hospitalized; 5 of the 40 would falsely report a stay (e.g., telescoping) if interviewed; and 15 match mistakes that generate 30 cross-classification errors.

TABLE 5 Hypothetical Data Illustrating the Combined Effects of Errors

Survey    Truth or Perfect Record                               Total
          Yes                          No
Yes       50 true positive reports     5 false positive errors
          -15 match errors             +15 match errors
          35 A-cell entries            20 B-cell entries          55
No        10 false negative errors     35 true negative reports
          +15 match errors             -15 match errors
          25 C-cell entries            20 D-cell entries          45
Total     60                           40                        100

The true survey bias is

[(A + B) - (A + C)]/(A + B + C + D) = 55/100 - 60/100 = -.05;

the full design estimate is

(B - C)/(A + B + C + D) = (20 - 25)/100 = -.05;

the AC design estimate is

-C/(A + C) = -25/60 = -.42;

137 the adjusted AC design estimate is

-C/(A + B + C + D) = -25/100 = -.25;

the AB design estimate is

+B/(A + B) = 20/55 = +.36;

the adjusted AB design estimate is

+B/(A + B + C + D) = 20/100 = +.20.

The AC estimate has the correct sign (minus) but is too large. The adjusted AC estimate shrinks in the right direction (toward -.05) but not far enough. The adjusted estimate (-.25) reflects the 10 false negative survey errors and half (15) of the 30 match-error entries. It, of course, completely ignores the 5 false positive errors and the other half of the match errors. In this example the AB estimate (+.36) has the wrong sign; the unadjusted estimate is too large; and the adjustment shrinks it in the right direction (toward -.05) but not far enough. The adjusted AB estimate (+.20) reflects the five telescoping errors and half of the match errors but is unaffected by the presence of 10 forgetting errors and the "compensating" match errors.

The conclusions I draw from the above are in the nature of cautions. In defining the survey problems to approach using cognitive theories and methods, do not assume that forgetting dominates survey reporting error. Much of the evidence for forgetting comes from AC design record checks whose design and bias estimator guarantee large negative values even in the presence of zero bias or a positive net bias. The corollary caution concerns future laboratory studies one might undertake. If these involve a criterion validity component, the full design principles are just as applicable in the laboratory as in the field.

Record-Check Estimates of Reporting Errors in Health Surveys

How well do the logic-based principles of record-check problems hold up in practice? In this section we look at actual record-check estimates of survey reporting bias for hospital stays and physician visits.
We will see that the full design studies tend to estimate very small net reporting biases (close to zero), while the incomplete AC designs produce a negative bias estimate (e.g., forgetting), and the incomplete AB designs yield positive estimates of response bias. The effect of the incomplete design approaches on the size and sign

138 of the response bias is as predicted: the idea of a dominant forgetting bias appears to be a methodological artifact. I show estimates of hospital stay reporting errors first and then estimates of physician visit reporting errors.

Hospital Stay Reporting Biases

The hospital stay bias estimates from the three kinds of record-check designs (Table 6) show the expected effects with respect to the direction of the response errors. The AC designs imply a net omission (i.e., forgetting) bias and the AB designs imply a net positive (i.e., telescoping) response bias. The two ABC design studies come close to full design procedures; both find approximately equal rates of underreporting and overreporting; one estimates a slightly higher relative underreporting rate while the other estimates a slightly higher overreporting rate. Although a full design estimate is not possible (neither study provides enough information to estimate the D-cell), net hospital stay response bias is probably very small across surveys, and most of the incomplete designs miss this conclusion entirely by concluding the existence of a large directional bias.3

Some of the studies that used incomplete designs obtained information about the "missing" cell; for example, Campbell et al. (1967) found some apparent overreports in their AC design and Andersen and Anderson found some underreports in their AB design. This can happen if the person reports (or the record contains) more than one hospital stay for the person and the two sources disagree about the extra stays. But multiple stays are atypical events and not a good basis for inferring a value for the missing cell. Occasionally researchers have pointed out the low frequency in the missing cell of incomplete designs and offered this as empirical evidence for assuming that those kinds of errors are infrequent enough to ignore. But readers who have stayed with me this far can see the potential fallacy in that argument.
The modified AB designs, which were used in the studies at the bottom of Table 6, unsuccessfully tried to turn an AB design into a full design. The modification was to check the records of all hospitals that each family reported using for all members of the family. For example, if Mr. Jones said he stayed at Metropolitan Hospital, the researchers

3The full design estimates suggest the absence of substantial net reporting biases. This is not something that can be derived from the record-check design and estimation characteristics. It is "new" information that I speculate about in the final section.

4A reviewer suggests that I caution readers not to generalize this result to response errors in reporting details of hospital admissions. Methods and estimates of the latter kind are given in Marquis (1980, Section III). For example, using "A-cell" cases for which both survey and record information was available, out-of-pocket cost net bias was close to zero while reports of length-of-stay and month of admission showed a small net positive bias.

139 [TABLE 6: record-check estimates of hospital stay reporting biases, by type of design; the table is garbled in the scanned original and is not recoverable]

140 also asked Metropolitan Hospital to check its records for stays of Mrs. Jones and for each of the children. While the modified procedure does yield more C-cell entries, the C-cell estimate is still not an unbiased estimate of this population parameter. The resulting estimate of the net response bias remains unsatisfactory.

Physician Visit Reporting Biases

Record-check estimates of errors in reporting physician visits also illustrate the design biases of the incomplete approaches. These data illustrate two other things: (1) the effect of a bias in the record information (it can lead to an interpretation of a reporting bias with the opposite sign) and (2) the length of the survey reference (recall) period apparently does not have the systematic effect on omissions that a memory decay theory would predict (the omission bias does not seem to increase as the length of the average recall period gets larger). These three points are discussed next using the estimates in Table 7.

The incomplete designs yield physician visit reporting bias estimates with the predicted signs. The three studies using AC designs (asking about doctor visits known from records) produce large underreporting bias estimates and suggest that either about a quarter, a third, or a half of known visits are not reported in surveys. On the other hand, the two AB design studies yield positive (overreporting) estimates of the response bias. Within each of the ABCD design studies, the estimates of over- and underreporting rates are approximately equal, raising the possibility5 that the net reporting bias is not significantly different from zero.

The ABC design estimates from Feather (1972) show how record bias can distort estimates of the survey reporting bias, even when using something close to a full design.
Her overreport estimate (46 percent) is substantially larger than her underreport estimate (14 percent), suggesting that, on the average, her respondents reported a lot of physician visits that actually did not occur within the 2-week reference period of the survey. Feather's records of doctor visits were summary "fee submissions," which are bills for complete outpatient services, each of which may represent more than one office visit (e.g., a visit for the complaint, a visit for diagnostic testing, several visits for treatment, and possibly one or two follow-up visits to monitor the treatment's effectiveness). Thus the records "underreport" visits. When they are cross-classified with the survey reports, there are a large number of B-cell entries (respondent reports of visits that cannot be matched to record reports of visits). The effect is to inflate the survey overreporting estimate, B/(A + B). Had Feather calculated a full design estimate, (B - C)/(A + B + C + D), it would have had a large positive value also. This is an illustration of the general principle that record biases can show up as response biases with the opposite sign.

5Indeed, Marquis et al. (1979) used the information in these studies to produce full design estimates of the net response bias. These estimates are not significantly different from zero.
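The sign-flip mechanism can be sketched with a small hypothetical population (the numbers below are my invention, not Feather's data): respondents report perfectly, but the record source captures only two-thirds of the true visits, so unmatched survey reports pile up in the B cell and masquerade as overreporting.

```python
# Hypothetical: 100 people, 60 with a visit. Respondents report all 60
# correctly, but the records capture only 40 of them; the 20 visits the
# records missed become unmatched survey reports (B-cell entries).
A, B, C, D = 40, 20, 0, 40
true_yes = 60
N = A + B + C + D

survey_bias = ((A + B) - true_yes) / N   # 0.0: reporting is error-free
ab_estimate = B / (A + B)                # ~ +0.33: spurious "overreporting"

print(survey_bias, round(ab_estimate, 2))
```

A negative bias in the records thus surfaces as an apparently positive response bias, the opposite sign, which is exactly the general principle stated above.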

141 [TABLE 7: record-check estimates of physician visit reporting biases, by type of design; the table is garbled in the scanned original and is not recoverable]

142 The last point about Table 7 concerns the memory decay (forgetting) bias. The studies use different reference period lengths, ranging from 2 weeks to 12 months. (They also use different visit measures, but the visit measure and the reference period length are not completely confounded.) If the memory decay hypothesis is correct, we should observe higher rates of omission (underreports) in studies that use longer recall (reference) periods. But the underreporting rates are not always larger for the longer reference periods. For example, restricting our attention to the "any visits" measure, the 36 percent underreport estimate for 2 weeks in the Balamuth et al. (1965) study is larger than the 11 percent underreport estimate for 12 months in the Loewenstein (1969) study.6 I return to the memory decay issue in the next section.

Reporting Biases for Other Variables

The principles illustrated above also apply to record checks for other objective subject matter variables. Of particular interest is a recent summary of "sensitive topics" record checks (Marquis et al., 1981). The topics include receipt of welfare, wage or salary income, illegal drug use, alcohol consumption, arrests and convictions for crimes, and embarrassing chronic conditions (e.g., hemorrhoids and mental illness). Not surprisingly, there have been incomplete-design record checks in these areas, and the conclusions drawn (e.g., people won't report socially undesirable information) are a function of the type of design used. Surprisingly (at least to me), the distributions of bias estimates from the full design record checks center on zero or possibly a small positive value for most of the sensitive topics, suggesting that the errors respondents (or records, or match procedures) make are largely offsetting on the average, rather than being mostly in one direction as would be the case for forgetting or lying about sensitive information.
Implications for Future Research

In this section I speculate about the implications for research of a net reporting bias close to zero, mention an example of a model that might fit the data, and suggest that perhaps a different conceptualization of offsetting errors is needed for health service use reports.

The empirical findings from "full design" record checks of hospital stay and doctor visit reports suggest that the net response bias is approximately zero, at least for surveys designed as those involved in the cited record checks. I see three implications of these findings.

One implication is that forgetting (and its possible underlying causes, such as failure to acquire the information, failure to retrieve

6However, within a single reference period length, record checks usually show a positive correlation of underreport probability and elapsed time between the event and the interview. See Cannell et al. (1977) or Table 8 in this paper for examples.

143 it, or decisions not to report it) is not necessarily the dominant response problem in health surveys as we normally design them. Thus, cognitive research that focuses only on forgetting hospital stays and doctor visits may not have much applied value for contemporary health surveys.

A second implication for cognitive research on health surveys is that it should contain criterion validity features such as fully designed record checks or carefully thought-out strategies of construct validity. Over the past 10-15 years, survey methodologists have sometimes substituted the assumption that "more is better" for empirical validity studies, inferring that a survey procedure that produces more reports of something yields better estimates than a less productive procedure. But the more-the-better assumption is unwarranted when fully designed record checks show that the net response bias for current procedures is approximately zero.

A third implication is that survey methodologists may want to reexamine the effects of recent changes in health survey designs that were incorporated to reduce forgetting biases (see, e.g., Cannell et al., 1977, Appendix). Such changes may be producing a net positive response bias.

A zero net response bias does not mean that survey responses are given without error, only that the errors are offsetting when calculating the survey mean (proportion, or other first-moment statistic). Offsetting errors are of concern to survey designers and analysts because they place limits on the precision of some population estimates and cause biased estimates of other population parameters (e.g., coefficients of association).
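To see how offsetting errors can leave a mean intact while biasing a coefficient of association, consider a small simulation (illustrative only; all numbers are invented): adding zero-mean response error to a reported variable barely moves its mean but attenuates its correlation with a second variable.

```python
import random

random.seed(1)

n = 1000
x = [random.gauss(2.0, 1.0) for _ in range(n)]      # true values
y = [xi + random.gauss(0.0, 1.0) for xi in x]       # correlated outcome
x_rep = [xi + random.gauss(0.0, 1.5) for xi in x]   # reported values with
                                                    # offsetting (zero-mean) error

def mean(v):
    return sum(v) / len(v)

def corr(u, v):
    """Pearson correlation, computed from scratch."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

print(mean(x), mean(x_rep))        # the two means are nearly identical
print(corr(x, y), corr(x_rep, y))  # the second correlation is smaller
```

This is the classical attenuation effect: a first-moment statistic can look unbiased while every association estimated from the same data is pulled toward zero.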
Cognitive science can make an extremely important contribution to survey design by describing (e.g., modeling) these errors and discovering what causes them (especially if the causes can be influenced by features of the survey design such as the construction of question sequences, the length of recall periods, or the behavior of the interviewer). The important point here is that the errors to be described, understood, and controlled can be offsetting or possibly random. They are not the product of a single cognitive process that produces only omissions. Whatever cognitive processes are operating can produce just as many false positive reports as false negative reports, so our explanations need to expand to take this phenomenon into account.

Sudman and Bradburn (1973) provide an example of a relevant response error model and show that it can make useful predictions in household expenditure surveys. They assume that forgetting and telescoping are two separate cognitive processes that are affected by the length of the recall interval. The proportion of forgetting response errors increases as the recall interval increases, while the proportion of telescoping errors decreases with elapsed time. The effects of the two types of errors will be approximately equal, then, for one particular reference period length.

So far, the full design record-check results do not contradict this formulation if one assumes that the omission and telescoping tendencies have approximately balanced out to create a net reporting bias close to zero in all of the full design studies examined. In addition, Cannell et al. (1977) cite several AC design studies that show an apparent increase in omissions with an increase in elapsed time.
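A toy version of the Sudman and Bradburn idea can be written down directly. The functional forms and parameter values below are my invention for illustration, not theirs: omissions grow with the recall interval while telescoping shrinks, so the net bias changes sign at one particular reference period length.

```python
import math

def net_bias(t, forget_rate=0.05, telescope_share=0.10, decay=0.3):
    """Net reporting bias at recall interval t (in months): telescoped
    extra reports minus omitted true reports, as shares of true events.
    All parameter values are hypothetical."""
    omitted = 1.0 - math.exp(-forget_rate * t)           # grows with t
    telescoped = telescope_share * math.exp(-decay * t)  # shrinks with t
    return telescoped - omitted

for t in (0.5, 1, 3, 6, 12):
    print(f"{t:>4} months: net bias {net_bias(t):+.3f}")
```

With these made-up parameters the net bias is positive (net telescoping) for very short reference periods and negative (net forgetting) for long ones, crossing zero in between, which is the compensating pattern the model predicts.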

144 But is this another methodological artifact or are we, indeed, observing half of the compensating phenomenon suggested by Sudman and Bradburn? If, for example, we could find AB design studies that show overreporting rates inversely proportional to elapsed time, the compensating forgetting and telescoping explanation would receive support in the health survey context. Unfortunately, none of the AB design record-check studies have published elapsed time data. Feather's ABC design study (Feather, 1972) does provide elapsed time data for both hospital stay underreporting and overreporting. I have adapted these data in Table 8, using the record version of date of admission (where possible) to make the elapsed time classification. Since date of admission was asked only for the most recent (or only) hospital stay, we must confine the analysis to these stays (somewhere between 75 and 80 percent of all reported and of all recorded stays).

The overreporting and underreporting trends in Table 8 do not support the Sudman and Bradburn model. Both overreporting and underreporting increase with elapsed time. Although not shown, the underreporting trends are similar to those cited by Cannell et al. (1977), suggesting that the reporting dynamics in Feather's research are similar to those operating in the other record-check surveys that ask about hospital stays.

Thus, a different error model may be needed for reporting of hospital stays and physician visits. We do observe, in the Feather data, an increase in both kinds of response errors with greater elapsed times between the survey and the event occurrence. It is this phenomenon (which some survey methodologists label random response error or simple response variance) that cognitive science might address. For example, is this apparent increase in "carelessness" with elapsed time characteristic of reporting of other kinds of events? Is it due to problems recalling the relevant event attributes correctly?
Are the reporting problems due to faulty decisions of the respondent about whether to report a recalled event or to the retrieval of incorrect information about the event followed by a logically correct decision about reporting? Understanding these kinds of phenomena is not going to be gained by traditional survey evaluation methods such as record checks. They need the kind of creative hypothesis formulation and carefully controlled laboratory testing that cognitive science can provide.

145 [TABLE 8: hospital stay overreports and underreports by elapsed time since admission, adapted from Feather (1972); the table is garbled in the scanned original and is not recoverable]

146 References

Andersen, R., and Anderson, O. 1967 A Decade of Health Services. Chicago: University of Chicago Press.

Andersen, R., Kravits, J., and Anderson, O., eds. 1975 Equity in Health Services: Empirical Analyses in Social Policy. Cambridge, Mass.: Ballinger.

Andersen, R., Kasper, J., and Frankel, M. 1979 Total Survey Error. San Francisco: Jossey-Bass.

Balamuth, E., Shapiro, S., and Densen, P.M. 1961 Health interview responses compared with medical records. Health Statistics, Publication No. 584-D5, Ser. D, No. 5. Washington, D.C.: U.S. Public Health Service.

Barlow, R., Morgan, J., and Wirick, G. 1960 A study of validity in reporting medical care in Michigan. Pp. 54-65 in Proceedings of the Social Statistics Section. Washington, D.C.: American Statistical Association.

Belloc, N.B. 1954 Validation of morbidity survey data by comparison with hospital records. Journal of the American Statistical Association 49:832-846.

Cannell, C.F., and Fowler, F.J. 1963 Comparison of hospitalization reporting in three survey procedures. Health Statistics, Ser. D, No. 8. Washington, D.C.: U.S. Public Health Service. Reprinted in Vital and Health Statistics, Ser. 2, No. 8, July 1965.

Cannell, C.F., Fisher, G., and Bakker, T. 1961 Reporting of hospitalization in the health interview survey. Health Statistics, Ser. D, No. 4. Washington, D.C.: U.S. Public Health Service. Reprinted in Vital and Health Statistics, Ser. 2, No. 6, July 1965.

Cannell, C., Marquis, K., and Laurent, A. 1977 A summary of studies of interviewing methodology. Vital and Health Statistics, DHEW Publication No. (HRA) 77-1343, Ser. 2, No. 69. Washington, D.C.: U.S. Government Printing Office.

Cartwright, A. 1963 Memory errors in a morbidity survey. Milbank Memorial Fund Quarterly 41:5-24.

Feather, J. 1972 A Response/Record Discrepancy Study. University of Saskatchewan, Saskatoon, November 1972. Available from the National Technical Information Service, Springfield, Virginia.

Kirchner, C., Lerner, R.C., and Clavery, O. 1969 The reported use of medical care sources by low-income inpatients and outpatients. Public Health Reports 84:107-117.

147 Loewenstein, R. 1969 Two Approaches to Health Interview Surveys. School of Public Health and Administrative Medicine, Columbia University.

Madow, W.G. 1973 Net differences in interview data on chronic conditions and information derived from medical records. Vital and Health Statistics, DHEW Publication No. (HRA) 75-1331, Ser. 2, No. 57. Washington, D.C.: U.S. Department of Health, Education and Welfare.

Marquis, K.H. 1978a Inferring health interview response bias from imperfect record checks. Pp. 265-270 in Proceedings of the Section on Survey Research Methods. Washington, D.C.: American Statistical Association.

1978b Record Check Validity of Survey Responses: A Reassessment, R-2319-HEW. Santa Monica, Calif.: The Rand Corporation.

1980 R-2555-HEW. Santa Monica, Calif.: The Rand Corporation.

Marquis, K., et al. 1979 N-1152-HEW. Santa Monica, Calif.: The Rand Corporation.

Marquis, K.H., et al. 1981 R-2710/2-HHS. Santa Monica, Calif.: The Rand Corporation.

Neter, J., Maynes, E.S., and Ramanathan, R. 1965 The effect of mismatching on the measurement of response errors. Journal of the American Statistical Association 60:1005-1027.

Sudman, S., and Bradburn, N.M. 1973 Effects of time and memory factors on responses in surveys. Journal of the American Statistical Association 68:805-815.

Sudman, S., Wallace, W., and Ferber, R. 1974 The Cost-Effectiveness of Using the Diary as an Instrument for Collecting Health Data in Household Surveys. Survey Research Laboratory, University of Illinois.
