Appendix B

Research Agenda Topics Suggested by the Literature

RESEARCH RECOMMENDATIONS FROM NATIONAL RESEARCH COUNCIL (1983)

As part of its three-volume report, the National Research Council Panel on Incomplete Data in Sample Surveys prepared separate sets of recommendations for improving survey operations and for structuring future research on nonresponse and other issues. The following text excerpts the 11 recommendations offered on future research (National Research Council, 1983, pp. 11–14).

The recommendations on research have three objectives: to provide a capital investment in computer programs and data sets that will make nonresponse methodology cheaper to implement and evaluate; to encourage research on and evaluation of theoretical response mechanisms; and to urge that long-term programs be undertaken by individual or groups of survey organizations and sponsors to provide for and accomplish cumulative survey research, including research on nonresponse.

Recommendation 1. General-purpose computer programs or modules should be developed for dealing with nonresponse. These programs and modules should include editing, imputing (single and multiple), and the calculation of estimators, variances, and mean square errors that, at least, reflect contributions due to nonresponse.

Recommendation 2. Current methods of improving estimates that take account of nonresponse, such as poststratification, weighting methods, and hot-deck imputation, especially hot-deck methods of multiple imputation, require further study and evaluation.




Recommendation 3. Theoretical and applied research on response mechanisms should be undertaken so that the properties and applicability of the models become known for estimates of both level and change.

Recommendation 4. A systematic summarization of information from various surveys should be undertaken on the proportions of respondents for specified parts of populations and for particular questions in stated contexts.

Recommendation 5. Research is needed to distinguish the characteristics of nonrespondents as opposed to respondents and to assess the impact of questionnaire design and data collection procedures on the level of nonresponse.

Recommendation 6. Data sets that permit good estimates of bias and variance to be made when various statistical methods of dealing with nonresponse are adopted should be made publicly available. Such data sets could be used for testing various methods of bias reduction and for assessing effects of the methods on variances. They could also be used for the evaluation of more general methods depending on models.

Recommendation 7. Theoretical and empirical research should be undertaken on methods of dealing with nonresponse in longitudinal and panel surveys.

Recommendation 8. Theoretical and empirical research on the effects of nonresponse on more complex methods of analysis of sample survey data, e.g., multivariate analysis, should be undertaken.

Recommendation 9. A consistent terminology should be adopted for descriptive parameters of nonresponse problems and for methods used to handle nonresponse in order to aid communication on nonresponse problems.

Recommendation 10. Research on response mechanisms that depend on reasons for nonresponse should be undertaken.

Recommendation 11. Data on costs should be obtained and analyzed in relation to nonresponse procedures so that objective cost-effective decisions may become increasingly possible.
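Recommendation 2 singles out hot-deck imputation for further study and evaluation. For orientation, the following is a minimal sketch of one common variant, a sequential hot deck within adjustment classes; the record layout, variable names, and class definition are illustrative assumptions, not part of the panel's text.

```python
def hot_deck_impute(records, class_key, target):
    """Sequential hot-deck imputation: within each adjustment class,
    a missing target value is replaced by the most recently seen
    respondent ("donor") value from the same class, in file order."""
    last_donor = {}  # adjustment class -> last observed donor value
    completed = []
    for rec in records:
        cell = class_key(rec)
        if rec.get(target) is not None:
            last_donor[cell] = rec[target]  # respondent becomes a donor
        elif cell in last_donor:
            rec = dict(rec, **{target: last_donor[cell]})  # impute from donor
        completed.append(rec)
    return completed

# Illustrative use: impute income within age-group x region cells.
sample = [
    {"age_group": "30-44", "region": "NE", "income": 52000},
    {"age_group": "30-44", "region": "NE", "income": None},  # item nonresponse
    {"age_group": "45-59", "region": "NE", "income": 61000},
]
filled = hot_deck_impute(sample, lambda r: (r["age_group"], r["region"]), "income")
```

Multiple imputation, which Recommendation 2 also mentions, repeats this kind of draw several times (with random rather than sequential donor selection) so that imputation uncertainty can be reflected in variance estimates.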

OTHER SELECTED RESEARCH TOPICS COMPILED BY THE PANEL

Theoretical Approaches to Nonresponse

We conjecture that there may be a direct link between the increase in efforts to contact households and refusals. Many households contacted because of the additional efforts may be more inclined to refuse precisely because of the increased contact efforts. This effect might be especially pronounced in telephone surveys, where the members of households with caller ID can see that numerous attempts have been made to contact them. If so, it is possible that the multiple attempts will predispose the household to refuse when they are finally reached.… This conjecture is consistent with our earlier suggestion that technological barriers may suppress the opportunity actually to hear the survey request. In this case, the barrier would promote refusals by increasing the rate of noncontact over time. Frustration with multiple contact attempts might also partially explain why so many RDD surveys with high nonresponse rates have low nonresponse bias. In terms of a mechanism for nonresponse, frustration with multiple contact attempts is generally not very selective and unlikely to target a particular group or subgroup. (Brick and Williams, 2013:55–56)

It is interesting to note that the two most prominent and useful models for thinking about survey nonresponse—social exchange theory and leverage–saliency theory—are actually models of survey participation. They do not explicitly address the relationship between contact efforts and participation efforts. Extending nonresponse models to include the effects of contact and testing these theories might yield valuable practical advice for survey researchers. (Brick and Williams, 2013:56)

Perhaps most important in the present study is the finding that the relationship between the type of respondent (cooperative, reluctant) and the attitudinal and background variables was not all in the same direction in all countries. This needs further research and discussion because it creates a serious challenge to any scholar who believes there is a theory of nonresponse that applies cross-nationally. (Billiet et al., 2007:159)

Nonresponse Bias

There may be additional hidden costs to the effort to maintain response rates in the face of mounting resistance. Many survey researchers suspect that reluctant respondents may provide less accurate information than those who are more easily persuaded to take part.… Although the general conditions that produce nonresponse bias in survey means or proportions are known (the bias is a function of both the nonresponse rate and the relation between the response "propensity"—the probability that a given case will become a respondent—and the survey variables), it is not clear what circumstances are likely to yield large nonresponse biases and what circumstances are likely to yield small or negligible ones. (Tourangeau, 2003:11)

Most of the survey literature on nonresponse has focused on its impact on means, proportions, and totals. The impact of attrition may be reduced for more complex, multivariate statistics (such as regression coefficients), but clearly more work is needed to document this difference. (Tourangeau, 2003:11)

Another kind of study is likely to assume increasing importance in the coming years; these studies will focus on the issue of when nonresponse produces large biases and when it can be safely ignored. Like investigations of measurement error, these studies may involve disruptions of ongoing efforts to maintain response rates (perhaps even lowering response rates by design) in order to assess their impact on nonresponse bias. In addition, it will be important to demonstrate that falling response rates actually matter (at least some of the time) and to understand the impact of nonresponse on complex statistics derived from survey data. (Tourangeau, 2003:12)

More research across a range of surveys is needed to answer the question as to whether higher response rates decrease nonresponse bias. Indeed, in the light of our mixed results, we are not able to decide which of the two models, the "continuum of resistance model" or the "classes of nonparticipants model[,]" finds most support in our data. Further research on the differences and similarities in reasons for refusing cooperation between the two kinds of reluctant respondents (easy- and hard-to-convert refusals) and the refusals who were reapproached and who still refused to participate in a survey is needed. (Billiet et al., 2007:160)
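Tourangeau's observation that nonresponse bias depends jointly on the nonresponse rate and on the relation between response propensity and the survey variables has a compact textbook form. The following is a sketch of the standard expressions (versions of which are often attributed to Bethlehem), not a formula taken from the quoted sources:

```latex
% Deterministic view: a population of N units splits into M nonrespondents
% and N - M respondents; the bias of the respondent mean is
\operatorname{Bias}(\bar{y}_r) \;=\; \frac{M}{N}\,\bigl(\bar{Y}_R - \bar{Y}_M\bigr)

% Stochastic view: unit i responds with propensity \rho_i; then approximately
\operatorname{Bias}(\bar{y}_r) \;\approx\; \frac{\operatorname{Cov}(\rho, y)}{\bar{\rho}}
```

Either form makes the quoted point explicit: a high nonresponse rate alone does not produce large bias unless respondents and nonrespondents differ on the survey variable, that is, unless propensity and the variable are correlated.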

Interviewer Effects

First, our results do not go far in explaining the mechanisms through which interviewer experience is related to cooperation. Since experience has a strong effect, further exploration of the mechanisms by which it occurs is of interest. Second, we have not addressed the question of whether experience has a positive effect due to learning or selective drop-out of less successful interviewers. Third, we believe that the lack of effect of inter-personal skills is related to problems in measuring these, rather than to the fact that they are not relevant. The question then is how such skills may be measured more successfully. (Sinibaldi et al., 2009:5968)

What is needed next are studies which address some of the other aspects of the doorstep interaction such as the intonation of the interviewer's voice and non-verbal behaviour and the other various intangible things which help to determine the outcome of a request for participation. It would also be useful to try to separate out the subtleties that make a professional interviewer a professional interviewer. (Campanelli et al., 1997:5-4)

The extent to which variation in interviewer practices, sample persons' interactional moves, and the interrelation between these practices and moves have measurable effects on response rates awaits further, quantitative investigation. Nonetheless, this study highlights two challenges for such research. First, if practices are effective because of their deployment in particular contexts, then their effectiveness can be assessed only by experimental designs in which that context is considered. One cannot simply assign some interviewers to do presumptive requests and others to do cautious ones; instead, properly varying the presumptiveness and cautiousness of requests depending on the circumstances may be optimal. Interviewers would need to be trained to recognize these situations—and to do so very quickly. Second, observational studies of practices need to be careful not to confuse the influence of an interviewer's practices on a sample person with the influence of a sample person's behavior on an interviewer. (Maynard et al., 2010:810)

The influences of interviewer behavior, as well as interviewer personality traits, are not yet well understood. It seems advisable to measure interviewer behavior at the interaction level rather than the interviewer level. To better understand the process of establishing cooperation, interviewer call records need to be investigated, which only more recently have become available. It also seems advisable to control for previous interviewer performance, which requires survey agencies to record and use these data. A largely unexplored area is interviewer effects in longitudinal surveys. (Durrant et al., 2010:25–26)

Given the apparent importance of the perception and interpretation of voice characteristics, an alternative method is to focus on the perceived interviewer approaches. Since there are probably many combinations of voice characteristics that can convey a similar interviewer approach (e.g., there are multiple ways to express authority), this method might be more fruitful. In that case, more research is needed into how interviewer approaches—as likeability, authority, and reliability—might be expressed and perceived during the introductory part of a telephone interview, and in which conditions they are effective in enhancing cooperation rates. (van der Vaart et al., 2006:497)

In general, more work is needed to assess whether certain types of survey items are more or less susceptible to nonresponse error variance or measurement error variance among interviewers. (West and Olson, 2010:1022)

Interviewer incentives are ill-understood and have received little attention in the research literature, relative to respondent incentives. The mechanisms through which they may act on interviewer response rates and nonresponse bias are possibly different from those that act on respondents, as interviewers and respondents have very different roles in the social interaction at the doorstep. Further research is needed to explore how, and under what circumstances, interviewer incentives could help achieve survey goals. (Peytchev et al., 2010:26)

There is also evidence that interviewer motivation is a major contributing factor in maintaining respondents' interest in a survey and preventing break-offs. So studies of interview length should also explore the burden placed on interviewers in different modes and how this impacts on data quality. (Roberts, 2005:4)
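West and Olson's phrase "error variance ... among interviewers" refers to variance components attributable to interviewers. As a sketch of the basic decomposition usually used to quantify interviewer effects (a generic random-intercept model, not a formula from the quoted papers):

```latex
% Outcome for case j assigned to interviewer i:
y_{ij} = \mu + u_i + e_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

% Share of variance attributable to interviewers (intra-interviewer correlation):
\rho_{\mathrm{int}} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
```

Separating the nonresponse and measurement components of the interviewer variance, as the quoted passage asks, requires interpenetrated assignments or validation data rather than this simple decomposition alone.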

Mixed Modes

Another question for future research is the relative power of following the attempts to obtain Web and IVR responses with a mail survey in Phase 2, rather than telephone. In many ways the telephone attempts during Phase 2 were similar to the initial contacts, i.e., both involved interaction by phone. It is reasonable to expect that switching to mail at this stage would have had a much greater impact on improving response to these treatment groups, but remains to be tested experimentally.… Using an alternative mode that depends upon a different channel of communication, i.e., aural vs. visual, to increase response may also introduce measurement differences issues that cannot be ignored. Understanding the basis of these differences should be a high priority for future research. (Dillman et al., 2008:17)

Mixed or multiple mode systems are not new, but new modes emerge and with them new mixes. This means that we have to update our knowledge about the influence of modes on data quality. We need comparative studies on new modes and mode effects, and preferably an integration of findings through meta-analysis. (De Leeuw, 2005:249)

Multiple mode contact strategies are employed to combat survey nonresponse. Still we need more research on the optimal mixes, preferably including other indicators besides response rate, such as bias reduction and costs. (De Leeuw, 2005:249)

Adjustment or calibration strategies for mode mixes are still in an early phase, and more research is needed. (De Leeuw, 2005:250)

Not much is currently known about people's preferences for different data collection modes. What modes would respondents prefer to use when participating in a survey? Meta-analyses of mode preference data have found that people tend to "over-prefer" the mode in which they were interviewed, but when mode of interview is controlled for, there is an overall preference for mail [surveys]. It is likely that these findings are now out of date, yet the apparent popularity of the Internet as a mode of data collection may well reflect an overall preference among respondents for self-completion. More research into public attitudes to data collection modes would shed light on this issue and might help guide survey designers in making mode choices. (Roberts, 2005:3)

Offering different survey agencies/countries or respondents a choice from a range of data collection modes will be a realistic option only once it is known that a questionnaire can practicably be administered in each of the modes on offer.… Not enough is known, however, about the extent to which modes are differentially sensitive to questionnaire length (and people's tolerance of long interviews), so any survey considering the feasibility of mixing modes will need to examine this problem. [Some] survey organisations impose a limit on the permissible length of phone interviews (e.g., Gallup's "18 minute" rule). But research has shown that people's willingness to respond to long surveys depends on their motivation and ability to participate which, to a large extent, will vary by survey topic. There may also be cultural variation in tolerance of interview length (e.g., norms regarding the duration of phone calls), and these should be investigated. (Roberts, 2005:4)

We need to understand better the non-response mechanisms associated with each mode. For example, non-response in self-completion surveys is often linked to variables of interest. A weakness of face-to-face interviewing is that we get greater non-response in urban populations than in rural ones. Each mode has weaknesses, and we need to be aware of what those weaknesses are. (Roberts, 2005:7)

Cell Phones

In terms of nonresponse, cell phone response rates trend somewhat lower than comparable landline response rates, but the size of the gap between the rates for the two frames is closing. This is thought to be due to landline response rates continuing to drop faster than cell phone response rates. Research needs to be conducted to more fully understand the size and nature of differential nonresponse in dual frame telephone surveys and the possible bias this may be adding to survey estimates. Future research needs also to seek a better understanding of how dual service users (those with both a cell phone and a landline) can best be contacted and successfully interviewed via telephone. (American Association for Public Opinion Research, 2010a:109)

Responsive Design

While we were quite successful in predicting response outcome prior to the study, surveys vary in the amount of information that is available on sample cases. Exploring external sources of information is needed, particularly for cross-sectional survey designs that do not benefit from prior wave data and may also lack rich frame data. Similarly, more research will be needed on how to apply these data prior to any contact with sample cases. Two alternatives are to apply model coefficients from similar surveys, or to estimate predictive models during data collection as proposed under responsive survey design (Groves and Heeringa, 2006). (Peytchev et al., 2010:26)

New and effective interventions for cases with low response propensities are needed in order to succeed in the second step of our proposed approach to reducing nonresponse bias. Such interventions are certainly not limited to incentives as their effectiveness varies across target populations, modes of data collection, and other major study design features. (Peytchev et al., 2010:26)

Further research is needed into the whole sequence of the survey process and how the protocols at each stage (e.g., screening) interact with those applied on other stages (e.g., refusal conversion or interviewing) of the process. The dynamic treatment regimes approach offers a roadmap for [how] this research might be conducted. The results developed here suggest that such a research program could be successful. (Wagner, 2008:76)
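Peytchev et al. mention estimating predictive models during data collection, as proposed under responsive survey design. The sketch below shows the general shape of such a step; the feature set, the cutoff, and the use of scikit-learn are illustrative assumptions, not the authors' specification. A propensity model is fit on cases already resolved, the still-active cases are scored, and low-propensity cases are flagged for an intervention such as an incentive or a mode switch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical mid-collection snapshot: frame/paradata features per case,
# e.g., urban flag, single-person household flag, prior contact attempts.
X_resolved = np.array([[1, 0, 3], [0, 1, 1], [1, 1, 5], [0, 0, 2]])
y_resolved = np.array([1, 0, 1, 0])          # 1 = responded; known for resolved cases
X_active = np.array([[1, 0, 4], [0, 1, 2]])  # cases still being worked

# Fit on resolved cases, then score the active ones.
model = LogisticRegression().fit(X_resolved, y_resolved)
propensity = model.predict_proba(X_active)[:, 1]

# Flag low-propensity cases for a protocol change (incentive, mode switch).
needs_intervention = propensity < 0.3
```

In production the model would be refit as cases resolve, and, as the quotation notes, coefficients from similar prior surveys can stand in before any data accumulate.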

Incentives

Relatively few studies have examined the effect of incentives on sample composition and response distributions, and most studies that have done so have found no significant effects. However, such effects have been demonstrated in a number of studies in which the use of incentives has brought into the sample larger (or smaller) than expected demographic categories or interest groups. (Singer and Ye, 2013:134)

Clearly, there is still much about incentives that is unknown. In particular, we have not examined the interaction of respondent characteristics such as socioeconomic status with incentives to see whether they are particularly effective with certain demographic groups. Geocoding telephone numbers in the initial sample might permit analysis of such interaction effects (cf. King [1998], who applied a similar method to face-to-face interviews in Great Britain). And we need better information on the conditions under which incentives might affect sample composition or bias responses. Such analyses should receive high priority in future work. (Singer et al., 2000:187)

The number of incentive experiments that could be designed is legion; unless they are guided by theory, they will not contribute to generalizable knowledge.… One question often asked is how large an incentive should be for a given survey. The issue here is the optimum size of an incentive, given other factors affecting survey response. If experiments varying the size of the incentive are designed in the context of a theory of survey participation that allows for changes in motivation over time, some generally useful answers to this question may emerge. In the absence of such theoretically based answers, pretesting is the only safe interim solution. (Singer, 2000:241)

Research is also needed on how paying respondents for survey participation affects both respondent and interviewer expectations for such payments in the long run. (Singer, 2000:25)

Research is needed on the conditions under which incentives not only increase response rates but produce a meaningful reduction in nonresponse bias. Because they complement other motives for participating in surveys—such as interest in the survey topic, deference to the sponsor, or altruism—it is reasonable to hypothesize that incentives would serve to reduce the bias attributable to nonresponse. Whether the use of incentives for this purpose is cost-effective is less easily answered, however, and research is needed on this topic, as well. (Singer, 2000:25)

Weighting and Nonresponse Adjustment

Including many auxiliary variables and using the fullest cross-classification of these variables possible in the weighting will quickly result in small numbers of respondents in at least some of the weighting cells. Guidance on appropriate cell sizes for calibration weighting is very limited. The appropriate cell size is a trade-off between the potential reduction in nonresponse bias associated with increasing the information in calibration weighting and the potential increase in the variance and ratio biases of the estimates. More research is needed in this area. (Brick and Jones, 2008:72)

Another area that requires more research is the effect of nonresponse on multivariate methods such as measures of association and linear and logistic regression parameters when the survey weights are used to compute these measures. The analytic results for odds ratios imply that the bias in this type of statistic could be sensitive to varying response propensities. Simulation studies on these multivariate statistics could prove very enlightening. (Brick and Jones, 2008:72)

The challenge of weighting adjustment, for survey researchers and practitioners, lies in the search for an appropriate set of auxiliary variables that are predictive of both response probabilities and survey variables of interest. We encourage survey researchers to engage actively in identifying an appropriate set of auxiliary variables in developing non-response adjustment weights. This should include identifying measures at the design stage that can be obtained on both respondents and non-respondents and that are good proxy variables for one or multiple survey variables. In the past, attention was often focused on finding variables that are associated with response although small R²-statistics are very common in response propensity models…. The results of this paper show that a renewed focus on correlates of the key survey outcome variables is warranted. An avenue that is worth exploring is statistics derived from call record data or other types of paradata that were not discussed here. (Kreuter et al., 2010:405–406)
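Brick and Jones describe the trade-off between fine weighting cells and sparse-cell instability. For concreteness, here is a minimal sketch of the weighting-class adjustment they are discussing, with an explicit check for the sparse cells the passage warns about; the field names and the cell-size threshold are illustrative assumptions.

```python
from collections import defaultdict

def weighting_class_adjust(cases, min_respondents=30):
    """Within each weighting cell, inflate respondent base weights by the
    inverse of the weighted response rate. Cells with few respondents are
    flagged for collapsing (the variance side of the trade-off)."""
    total_w = defaultdict(float)   # sum of base weights, all sampled cases
    resp_w = defaultdict(float)    # sum of base weights, respondents only
    n_resp = defaultdict(int)      # respondent count per cell
    for c in cases:
        total_w[c["cell"]] += c["weight"]
        if c["responded"]:
            resp_w[c["cell"]] += c["weight"]
            n_resp[c["cell"]] += 1

    sparse_cells = {cell for cell, n in n_resp.items() if n < min_respondents}
    adjusted = [
        {**c, "weight": c["weight"] * total_w[c["cell"]] / resp_w[c["cell"]]}
        for c in cases
        if c["responded"]
    ]
    return adjusted, sparse_cells
```

Finer cells (more auxiliary information) can remove more nonresponse bias but shrink the respondent count per cell, inflating the variability of the adjustment factors, which is exactly the open question the quotation identifies.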

Paradata

Regarding further research, we make several suggestions. First, we suggest looking to new technologies to further assess paradata validity and quality. If possible, the use of computer-assisted recorded interviewing (CARI) might be implemented. Ideally, we could record the pre-interview door-step interactions so we could have the "truth" against which to compare [contact history instrument (CHI)] entries. However, given the legal and policy requirements to obtain informed consent prior to using CARI, this may prove impossible. An alternative is to have trained observers shadow interviewers, record their own versions of CHI, and then compare their records and the interviewer's. Second, we recommend bringing interviewer characteristics into the equation when assessing paradata quality (e.g., years of experience, gender, education). Since recording interviewer–respondent interactions is a rather subjective undertaking, interviewers are undoubtedly a source of systematic variance. To date, there is very little research regarding interviewer impact on the collection of paradata. (Bates et al., 2010:103)

We encourage future work in this area that might include indicators for time and part of the day or other features that would be correlates of respondent attributes related to contactability and cooperation. (Kreuter and Kohler, 2009:224)

This paper did not consider the measurement error properties of the interviewer observations and record variables. We made a simplistic assumption that there is no measurement error in those variables. Of course, this assumption is debatable in the real world. Future research is needed to examine the effect of the potential measurement error in auxiliary variables on survey estimates and on the bias–variance trade-off. Although it will be difficult to do so, research is also needed on the presence and effect of selective measurement error, e.g., if measurement error in the auxiliary variables is correlated with response. (Kreuter et al., 2010:405)
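Kreuter and Kohler point to time-of-day indicators built from call records. As a small sketch of that kind of paradata feature construction (the record layout and day-part windows are hypothetical): classify each call attempt into a day-part window and compute per-household contact rates by window, which can then serve as contactability correlates in a response propensity model.

```python
from collections import defaultdict

def day_part(hour):
    """Map an attempt's hour (0-23) to a coarse day-part window."""
    if 9 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 21:
        return "evening"
    return "other"

# Hypothetical call records: (household id, hour of attempt, contact made?).
calls = [(101, 10, False), (101, 18, True), (102, 13, False), (102, 19, False)]

attempts, contacts = defaultdict(int), defaultdict(int)
for hh, hour, contacted in calls:
    key = (hh, day_part(hour))
    attempts[key] += 1
    contacts[key] += int(contacted)

# Contact rate per household and window: a candidate contactability feature.
contact_rate = {key: contacts[key] / attempts[key] for key in attempts}
```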

Administrative Records

Administrative records are another avenue agencies are pursuing for use as sampling frames, as survey benchmarks, as sources of auxiliary data for model-based estimates, and for direct analysis. This is a promising area for future research, Abraham said, but she added a word of caution about treating administrative records as the "gold standard" of data, because little is known of their error properties. (National Research Council, 2011:7, summarizing a workshop presentation by Katharine Abraham, University of Maryland, College Park)

For many years, members of the statistical community have said that administrative records can and should be used more fully in the federal statistical system and in federal programs. The use of administrative records in the Netherlands and other countries gives a good flavor of the kinds of things the statistical system can envision doing in the United States to varying degrees. There are also areas, however, in which substantial work has already been done in the U.S. context. Most notably, administrative records have been used in economic statistical programs since the 1940s. There are also good examples of administrative data use with vital statistics, population estimates, and other programs across several federal statistical agencies. (National Research Council, 2011:41–42, summarizing a workshop presentation by Rochelle Martinez, U.S. Office of Management and Budget)

[Another] barrier is administrative data quality. Although they are not perfect, with survey data, agencies have the capability to describe and to understand the quality of what they have. In other words, there are a lot of measurement tools for survey data that do not yet exist for administrative records. Some have assumed that administrative data are a gold standard of data, that they are the truth. However, others in the statistical community think quite the opposite: that survey data are more likely to be of better quality. Without a common vocabulary and a common set of measurements between the two types of data, the conversation about data quality becomes subjective. (National Research Council, 2011:44, summarizing a workshop presentation by Rochelle Martinez, U.S. Office of Management and Budget)

Another significant data quality issue for statistical agencies is the bias that comes with the refusal or the inability to successfully link records. In addition to the quality of the administrative data as an input, the quality of the data as they come out of a linkage must be considered as well. (National Research Council, 2011:44, summarizing a workshop presentation by Rochelle Martinez, U.S. Office of Management and Budget)

For the future, Trépanier said, using administrative data to build sampling frames is of particular interest. There is the risk of coverage error in using an administrative database in constructing a frame, but if it is done in the context of using multiple other frames and calibration to correct coverage error, this is probably less of an issue. The ideal goal is a single frame, which is the approach used in building Statistics Canada's Address Register, but this does not preclude the inclusion of auxiliary information. A single frame would allow for better coordination of samples and survey feedback, she said. (National Research Council, 2011:49–50, summarizing a workshop presentation by Julie Trépanier, Statistics Canada)

For data collection, one of the goals related to administrative data is to enable tracing. Statistics Canada wants to centralize the tracing process leading to the linking of all administrative data sources to make available the best contact information possible. This will require substantial effort, including a process to weigh the quality of the different sources and determine what contact information is most likely to be accurate. Another goal for administrative data could be to better understand the determinants of survey response and improve data collection procedures based on this information. For example, administrative data can provide guidance on preferred mode of data collection if one can assess whether persons who file their taxes electronically are also more likely to respond to an electronic questionnaire. (National Research Council, 2011:50, summarizing a workshop presentation by Julie Trépanier, Statistics Canada)

Statistics Canada has been successful in using substitution of income data from tax records, and this is likely to be continued. It is yet unclear, however, whether other information is available that could replace survey data. Investigating these options is done with caution because of the risk discussed. There is also the problem of ensuring consistency between survey and administrative data across variables. (National Research Council, 2011:50, summarizing a workshop presentation by Julie Trépanier, Statistics Canada)

Administrative data can also assist researchers in better understanding nonresponse bias and the impact of lower response rates. Finally, they can help both reduce the volume of data collected in surveys and improve estimation. Now that Statistics Canada has the omnibus record linkage authority in place, exploring all of these options has become a much easier process. (National Research Council, 2011:50, summarizing a workshop presentation by Julie Trépanier, Statistics Canada)