7 Methodological Research and Evaluation

One of the undoubted strengths of the SIPP program, including its forerunner, the Income Survey Development Program (ISDP), has been the extent of research on the quality of the data and ways to enhance quality (and efficiency) through improving the design and operation of the survey. Indeed, a strong research and evaluation component was essential during the planning and start-up phases because of SIPP's complex nature and broad scope.

Looking to the future, we believe that SIPP will continue to require methodological research on many aspects of the program.² The Census Bureau will need information in the near term for many details of the proposed redesign. Subsequently, the Bureau will need information on the impact of the redesign to guide research and experimentation directed to further improvements in the survey. In addition, users will need continuing information on data quality to make the most appropriate use of the survey information.

¹ To assist it in determining priorities for research and evaluation and in designing specific projects, the Census Bureau has consulted with the members of the Working Group on Technical Aspects of SIPP, sponsored by the Survey Research Methods (SRM) Section of the American Statistical Association (ASA).
² We note also the importance of research on analytical measurement issues, such as those discussed in Chapter 6 (e.g., the definition of spell length and whether to include all spells observed in a panel in an analysis of duration), and we believe strongly (see Chapter 8) that the Census Bureau should give increased attention to analytical issues for SIPP.
In this chapter we first briefly review the scope and accomplishments to date of the SIPP methodological research program at the Census Bureau.³ We then outline research strategies and summarize priority topics (many of which we discuss in other chapters) for which further methodological work is indicated, under two main headings: research to inform and evaluate the redesign and continuous monitoring of error levels for the benefit of analysts both inside and outside the Bureau. We conclude with a detailed discussion of a recently inaugurated program of cognitive research on the SIPP questionnaire. This innovative work shows great promise to improve data quality, although it presents difficult questions of implementation and integration with other planned improvements for SIPP, such as computer-assisted personal interviewing (CAPI).

RESEARCH TO DATE

Topics

From the very early days of designing a new income survey, the Census Bureau has been concerned with identifying and conducting research on a wide range of methodological issues that were critical for SIPP to achieve its goals of providing improved data on income and program participation.⁴ The ISDP investigated a number of important topics, including:

· the length of the recall period;
· respondent rules (self- versus proxy reporting);
· alternative questionnaire designs (e.g., short versus long forms of the income receipt and amounts questions; household screening format versus person-by-person format);
· mover follow-up rules; and
· the definition of longitudinal units.

The abrupt termination of the ISDP program in 1981 left many issues in various stages of resolution.
Initial methodological research and evaluation at the Census Bureau during the first 2 years of SIPP concentrated on the following topics:

³ We note that important methodological research and evaluation on SIPP has been conducted by analysts outside the Census Bureau; examples are Curtin, Juster, and Morgan (1989); Doyle and Dalrymple (1987); and Vaughan (1988).
⁴ This section draws heavily on Committee on National Statistics (1989:Ch. 7), which, in turn, benefited greatly from a presentation by Daniel Kasprzyk to the Committee on National Statistics subcommittee on SIPP. See the SIPP Quality Profile (Jabine, King, and Petroni, 1990) for references and summaries of findings for methodological studies conducted on SIPP; see also Kasprzyk (1988b) and Petroni, Huggins, and Carmody (1989). David (1983) provides references to methodological studies conducted on the ISDP.
204 THE SURVEY OF INCOME AND PROGRAM PARTICIPATION

· wave nonresponse and its treatment;
· field considerations, including the impact of respondent rules in general, respondent rules for students, and mover follow-up rules;
· the "seam problem" (i.e., the observed tendency to report more transitions between months marking the boundary between two interviews than between months included within a single interview);
· comparisons of survey responses with administrative records data; and
· issues involved in developing and analyzing longitudinal data from SIPP, including imputation, estimation, and household and family concepts.

As more SIPP data became available for analysis, other issues pertinent to the quality and utility of the information emerged and gained in importance:

· the pervasiveness of the seam problem;
· sample attrition;
· problems with the independent collection of industry and occupation data from one wave to the next;
· the lack of baseline data (e.g., the lack of information in the first interview for program participants on when the participation spell began);
· the lack of data on employer-provided benefits; and
· the need to provide users with measures of quality through such means as comparisons of SIPP cross-sectional estimates with independent sources.

Still later, yet additional methodological concerns surfaced, including:

· problems with constructing program eligibility measures given that key data were scattered across topical modules; and
· the possibility of time-in-sample bias (i.e., systematic differences in responses by individuals between later and earlier interviews).

At the same time, budget cutbacks, which necessitated restricting sample size and the number of interviews, motivated research on ways to compensate for diminished sample size and to control costs.
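The seam problem listed above can be made concrete with a small calculation. The following Python sketch, offered only as an illustration, compares the share of adjacent month pairs that show a status change at the seam between two interviews with the share for month pairs reported within a single interview. The function names, the data layout, and the example histories are hypothetical; the sketch also assumes that each person's history starts at the first reference month of that person's first interview and that every interview covers the same number of months (4 by default, matching the SIPP recall period).

```python
from collections import Counter


def transition_counts(histories, wave_length=4):
    """Count month-to-month status changes at seams and within waves.

    histories: list of per-person lists of monthly statuses (any hashable
    values), ordered by reference month and starting at the first month
    covered by that person's first interview (an assumption of this sketch).
    wave_length: number of reference months covered by each interview.
    """
    counts = Counter()
    for history in histories:
        for m in range(len(history) - 1):
            # Months m and m + 1 straddle a seam when m is the last month
            # reported in one interview and m + 1 the first in the next.
            kind = "seam" if (m + 1) % wave_length == 0 else "within"
            counts[kind + "_pairs"] += 1
            if history[m] != history[m + 1]:
                counts[kind + "_changes"] += 1
    return counts


def transition_rates(counts):
    """Share of adjacent month pairs showing a change, by pair type."""
    return {
        kind: counts[kind + "_changes"] / counts[kind + "_pairs"]
        for kind in ("seam", "within")
        if counts[kind + "_pairs"] > 0
    }


# Hypothetical participation histories (1 = participating) over two waves.
example = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 1, 1],
]
rates = transition_rates(transition_counts(example))
```

If reporting were free of seam effects, the two rates would be expected to be similar; a seam rate well above the within-wave rate is the pattern the text describes.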
Methods and Results

The Census Bureau has used a range of techniques for methodological research and evaluation, including:

· small-scale and large-scale field experiments with changes in procedures or question wording (experiments were conducted with telephone interviewing to reduce costs, gifts to reduce attrition, collecting data on employer-provided benefits, different procedures to reduce the seam problem,
and providing respondents with prior-year asset responses to improve reporting of changes in asset holdings);
· comparisons of SIPP responses with administrative records on an individual match basis (most notably, the record-check study [see Marquis and Moore, 1989, 1990a, 1990b]);
· comparisons of aggregates from SIPP with those from other surveys and administrative records;
· internal analysis of SIPP data (e.g., an evaluation of the effectiveness of the cross-sectional weights in compensating for attrition by comparing estimates at wave 2 for all wave 2 respondents and only those respondents who remained as of wave 6 [see Petroni and King, 1988; King et al., 1990]);
· analysis of data from reinterviews of SIPP respondents (e.g., Hill, 1989);
· simulation studies (e.g., simulating alternative schemes for oversampling subgroups of policy interest); and
· most recently, application of cognitive research techniques (e.g., one-on-one sessions in which the interviewer asks the respondent to think aloud in answering each question) to understand respondents' perceptions of the questionnaire (see below).

As a result of the research and evaluation program, the Census Bureau has instituted some changes in procedures and questionnaire content for SIPP. For example, a second administration in each panel of the assets and liabilities module was dropped, as was the missing wave module,⁵ because of adverse research findings about the quality of the data. Research results also contributed to the recent decision to switch from maximum personal to maximum telephone interviewing for SIPP, as well as the strategy adopted for oversampling low-income households beginning in the 1995 panel. In other instances, research has not produced clear findings and hence is continuing: for example, various steps to reduce the seam problem have thus far had little effect.
In still other cases, resources have not been available to implement findings: for example, no changes have yet been made to imputation and weighting procedures to adjust for biases found in research.

Whatever the outcome for SIPP operations, in almost all instances research has generated valuable information for users on the possibilities and limitations in the data. Research results have been disseminated through the SIPP Working Paper series. Also, to bring together available information on data quality, the Census Bureau put a substantial effort into developing the SIPP Quality Profile. (The ASA/SRM working group was a prime mover behind this project.) The first edition of the profile was published as a SIPP Working Paper (King, Petroni, and Singh, 1987); the second edition (Jabine, King, and Petroni, 1990) represents a reorganization and significant expansion of the material.

⁵ The missing wave module was designed to fill in information for people who had missed the preceding wave but were interviewed at the wave prior to that one. The module, which was administered beginning in wave 4 of the 1984 panel and discontinued midway through the 1986 panel, asked an abbreviated set of questions on labor force status, program participation, income receipt, and asset ownership for the reference period covered by the preceding wave. Evaluation determined that the number of transitions reported was much smaller than predicted and that the additional information obtained did not appear to justify the respondent burden and cost of collection (Jabine, King, and Petroni, 1990:40).

The SIPP Quality Profile brings together what is known about methodological problems with SIPP at each stage from data collection through data dissemination and also includes aggregate comparisons with other data sources. The profile is of great value for data users, for survey methodologists, and for the Census Bureau in determining the agenda for further research and evaluation. Indeed, the profile sets a standard for the field that other surveys would do well to emulate.

REDESIGN OF SIPP

SIPP is scheduled to undergo a major redesign in the mid-1990s. The Census Bureau expects to decide on the basic elements of the redesign by the end of 1992, but many details will be worked out later. It will be important to have a targeted research program to provide information to help resolve outstanding issues and to ensure the smoothest implementation possible of the new design. When it is in place, the redesign should achieve important improvements in many aspects of survey operations and in the quality and utility of the data; however, it may also have untoward effects. Again, research will be needed to identify the successes of the redesign and to suggest ways to handle any problems that arise.
In this section we discuss the components of a program of methodological research that we believe the Census Bureau should put in place to inform and evaluate the SIPP redesign. These components include: research to improve the format and wording of the questionnaire; research targeted to other aspects of the redesign (e.g., implementation of CAPI); research on issues of estimation and data use (e.g., weighting and imputation) in light of the redesign; research to evaluate the success of the redesign; and, finally, a quick-response capability to address unanticipated problems in and after implementation.

Questionnaire Content

In Chapter 3 we propose a number of content changes to SIPP: for example, adding a few questions about the respondent's family of origin,
obtaining additional detail about household relationships, and ascertaining information needed to determine eligibility for major assistance programs on a more frequent basis. (Correspondingly, we identify some topics that might be scaled back in detail.) We also recommend experimental work to develop measures of protection against economic risk (e.g., access to credit). In addition, we strongly urge a total overhaul of the questionnaire content related to measurement of assets in SIPP. We are sure that the community of users will have suggestions for the questionnaire as well.

We understand that the Census Bureau hopes by the end of the year to decide on the content changes in the core questionnaire that will be implemented as part of the redesign. It appears quite reasonable to adopt this schedule for determining topic areas and the general level of detail in the core, particularly given that the staff who are working to design the CAPI and database management systems for SIPP need this information. However, we believe that the Census Bureau should not try to lock in the precise format and question wording before thorough testing of proposed changes.

Such testing is critical in view of the inevitable tendency in any ongoing survey program to resist frequent questionnaire changes. Although we hope and expect that the conversion of SIPP to CAPI and database management system technology will make it easier to modify content and format as needed, it is still true that questionnaire changes will not and should not be made lightly. The occasion of the redesign offers the opportunity to add, delete, or modify a large number of questions. Such an opportunity will not likely occur again for many years. Hence, it is incumbent upon the Census Bureau to evaluate proposed content changes as thoroughly as possible before determining the final format and question wording for the redesign.
We realize that time is short, particularly given the need to implement the questionnaire in CAPI and a new database management system. We suggest that, in addition to standard pretests, the Census Bureau make use of two means of questionnaire testing and evaluation that we believe could be implemented rapidly.

First, we suggest that the SIPP staff who use the core data and who will redesign the questionnaire work closely with the researchers who are in charge of the Census Bureau's program to apply the results of cognitive research to developing improved methods for collecting higher quality data in SIPP (see below). That program is testing a very different set of interviewing procedures (e.g., more use of records by respondents, conducting the interview for a household on a group basis) and a very different questionnaire format, in which many of the questions are free-form (e.g., the respondent is asked to name income sources in any order and supply amounts as received rather than as fixed monthly totals). As discussed below, we are highly supportive of this program but skeptical that it can be carried out on a sufficiently rapid time schedule or with sufficient evaluation to permit the
new procedures and format to be incorporated for the redesign. Recognizing that it may be difficult to obtain the full benefits from this program unless it is adopted as a package, we nonetheless believe that the format and wording of a more standard SIPP questionnaire could likely be improved by the results of the cognitive research program.⁶ We urge the SIPP design and analysis staff to make as full use as possible of the findings of this program in determining the final form of the questionnaire.

Second, we suggest that the Census Bureau use small-scale record-check studies, particularly forward record-check studies, as a vehicle for questionnaire testing and evaluation. Forward record checks would involve drawing samples of a few hundred cases each from relevant administrative record sources (e.g., program records, employer records, or tax records) and administering the SIPP interviews to each case. There are several advantages of this approach for questionnaire research. The samples can target population groups of special relevance for SIPP, such as recipients of such programs as AFDC. The availability of administrative data for the samples provides the opportunity to evaluate the quality of the responses. For example, cognitive research has demonstrated that program recipients may fail to report their benefits or may describe them as something else because they do not recognize the name of the program as it is listed in the questionnaire. A forward record check would provide a ready means to evaluate the effectiveness of alternative ways of identifying programs for respondents.

Another advantage of forward record checks for questionnaire research purposes is that, for many research objectives, it is not necessary to obtain a nationally representative sample.
Hence, in the case of state-administered programs, for which there is wide variation in the quality and accessibility of administrative records, the samples could be selected from those states with the best systems.⁷ Finally, forward record checks can be implemented on a timely basis because there is no need for an after-the-fact match of the administrative and survey data.

We note that the Census Bureau's cognitive research program is itself using a forward record-check approach, drawing small samples from records for four programs and one employer in Milwaukee County, Wisconsin, which will be the site of a major evaluation study of the new interviewing procedures and questionnaire format (Bureau of the Census, 1992b; see also below). We note further that the ISDP experienced considerable success with forward record checks for evaluating the ability of the survey questionnaire to elicit accurate reports of program participation and benefits (see Kasprzyk, 1983). We urge the Census Bureau to give high priority to using forward record-check studies for refining the questionnaire that is implemented as part of the SIPP redesign.

⁶ For example, it might be possible to allow respondents to supply information about amounts in a somewhat freer format once recipiency has been ascertained in the standard fashion (e.g., specifying the date and amount of each payment, which could then be converted by the data processing system to monthly values).
⁷ However, for programs for which eligibility rules, regulations, and administrative practices vary widely across states (e.g., AFDC), record checks that are limited to a few states may resolve reporting issues only for those states and not others.

However, forward record-check studies cannot assess the effects of false reports of program participation (or other behaviors), since this would require drawing samples of known nonparticipants of the program under study. Since false positive errors are rare in SIPP (see Marquis and Moore, 1989, 1990a), large samples would be required to obtain a sufficient number of false positive responses to study the causes and remedies for overreporting in SIPP. More research is needed on the design of these types of record-check studies. For example, sampling efficiency could be improved if the sample could target groups of nonparticipants who are more prone to misreport program participation, such as people who experience frequent transitions into and out of programs.

Other Aspects of the Redesign

Other aspects of the SIPP redesign for which it would be useful to conduct methodological research (including design changes that could be appropriate to implement somewhat later on) are the length of the recall period, oversampling based on screening, implementation of CAPI, and telephone interviewing. We note that forward record-check studies, in addition to supporting questionnaire research, could well be used to evaluate other aspects of the redesign.

For length of recall period, we decided not to recommend a change in the current 4-month recall period for SIPP because of the possible adverse effects on the quality of the monthly information, which is critical for so many policy and research uses of SIPP.
However, a move to 6-month recall could permit an increase in both the sample size and length of each panel and might reduce the effects of attrition for longer panels. The literature does not provide clear guidance on the pros and cons of 4-month versus 6-month recall for SIPP. Hence, we urge the Census Bureau to give priority to research on this issue.⁸ Research is also needed to investigate recall period effects for cognitively designed interviews (see below), since memory effects for these types of interviews may be quite different from those for the traditional SIPP interview.

We do not consider it appropriate to change the recall period for the redesign. However, if the research results indicate that a 6-month recall period would provide data of acceptable quality, then the Census Bureau should give serious consideration to changing the survey design to somewhat longer and larger panels without waiting until the next major scheduled redesign.

⁸ Further analysis of the SIPP record-check study could perhaps contribute to understanding of recall effects on the quality of the monthly data.

We suggest that the efficiency of the proposed oversampling of the low-income population in SIPP, which the Census Bureau plans to implement by using information from the 1990 census, might be improved by conducting a screening interview close to the time of wave 1 for each panel. Research needs to be conducted on ways to implement the screening approach, to minimize the costs and possible difficulties of this approach for field operations, and to develop better estimates of the reductions in sampling errors, compared with those in the Census Bureau's proposal.

As the Census Bureau moves toward CAPI for SIPP, methodological research is needed to evaluate and understand CAPI's effects on the interview, the respondent, the interviewer-respondent interaction, and ultimately, the distribution of measurement error for SIPP items. In addition, more should be learned about the cost-error tradeoffs of using CAPI for SIPP. Also, research is needed to develop methods that take full advantage of the capabilities of CAPI (e.g., on-line editing, computer-directed probing, and interviewer help screens) that have the potential for reducing measurement errors. As the development of a CAPI system for SIPP progresses, research should proceed simultaneously on the quality differential between CAPI and paper-and-pencil methods for SIPP.

As part of the plan to convert to CAPI, the Census Bureau will need to consider the appropriate role of personal versus telephone interviewing in SIPP. Research is needed on the implications for costs and data quality of continuing the current mode of maximum telephone interviewing in waves 3-5 and 7-8 versus reverting to heavier use of in-person interviews.
A particular concern is the feasibility of using recently developed cognitively based interviewing procedures and questionnaire formats over the telephone.

The Census Bureau's evaluation of telephone versus in-person interviewing to date (from a 1985-1986 telephone experiment) does not provide sufficient information for an informed decision. For example, sizable differences for many items were not statistically significant, indicating inadequate power for the comparisons. The telephone experiment also did not obtain adequate information to assess the cost implications of different interview modes (because only the designated interview mode, not the actual mode, was recorded).⁹ More needs to be learned in order to evaluate the most cost-effective allocation of personal visits and use of the telephone in an environment of computer-assisted interviewing. Also, work should be done to determine the possible savings in cost and improvements in quality that could be effected by using centralized computer-assisted telephoning for some types of SIPP interviews.

⁹ We presume the Census Bureau is obtaining cost information from the use of maximum telephone interviewing for the current SIPP panels.

Estimation and Data Use

Much of the SIPP methodological research program has focused on understanding and improving data quality at the source, for example, by means of changes in questionnaire wording and interviewing procedures. Although this focus is appropriate, research is also needed on how best to treat the quality problems that remain at the points of data analysis and use. In this section we discuss needed research on weighting and imputation that is particularly important given the changes that we propose in the design of SIPP, namely, longer panels and less frequent introduction of new panels. We also discuss the problem that undercoverage in SIPP varies across population groups, which has implications for weighting adjustments as well as for improved field procedures. Population undercoverage affects household surveys generally, but it may be particularly troublesome for SIPP given the evidence that it is low-income people who are most likely to be missed.

Weighting and Imputation

To date, Census Bureau staff and outside analysts have examined weighting procedures to compensate for sample selection and attrition and imputation procedures to adjust for item nonresponse. Both cross-sectional and longitudinal procedures have been assessed. It is clear that the current cross-sectional weights do not adequately adjust for differential attrition by such characteristics as income level (Petroni and King, 1988; King et al., 1990). It appears likely as well that the weights do not adequately compensate for differential undercoverage of population groups in the survey (see below).
There is also evidence that the current cross-sectional imputations do not adequately reproduce known relationships between income, assets, and program participation (Doyle and Dalrymple, 1987; Allin and Doyle, 1990). However, the research in these areas has not been carried to the point of identifying optimal revisions to the Census Bureau's weighting and imputation programs. We urge that such research be conducted and that appropriate changes in the Bureau's procedures be implemented on the basis of the results. One avenue to pursue is research on the benefits of more extensive use of wave 1 income and program participation variables to adjust for attrition in subsequent waves.

The need for further research on longitudinal weighting and imputation is even more critical. The current longitudinal weights reduce the available sample size for analysis because only people with data from all waves of an
entire panel (or a calendar year) are given positive weights. The current longitudinal imputations are implemented only for selected variables and only after all waves of a panel are complete.¹⁰

Further investigation of improved weighting and imputation procedures, both cross-sectional and longitudinal, is especially important in the context of the proposed redesign. The less frequent introduction of new panels will require research on more effective cross-sectional weighting adjustments to compensate for attrition bias in estimates for the "off" years (every other year), when a new panel is not in the field. Research will be needed as well on the extent of bias in the off-year estimates (and ways to compensate for the bias) that results from loss of coverage for people who enter the SIPP universe (e.g., by leaving institutions) and have no antecedents in the previous on-year population.

The increase in panel length will also require research on more effective longitudinal weighting procedures to minimize the loss of sample cases. The development of imputation procedures to supply data for waves that are missing in their entirety could help with this problem. The use of imputation seems particularly promising for cases with only one or two missing waves, in which the missing waves are bounded by interview data for the preceding and succeeding waves.

For another longitudinal estimation issue, we urge the Census Bureau to conduct research on appropriate weighting for analyses of spell duration (spells of low income or program participation) that use such standard survival analysis procedures as the Kaplan-Meier and proportional hazards modeling approaches. Analysts using these methods often assume that the survival probability of people who remain in the sample is the same as those who do not (within subgroups or given a common set of covariates).
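As a rough illustration of the estimator under discussion, the following Python sketch computes a Kaplan-Meier survival curve in which each case contributes its survey weight rather than a count of one. The function name, the inputs, and the example figures are hypothetical and are not drawn from SIPP. The sketch treats attrition as ordinary right-censoring, assumes all weights are positive, and does not address variance estimation under a complex sample design.

```python
def weighted_kaplan_meier(durations, ended, weights):
    """Weighted Kaplan-Meier estimate of the spell survival function.

    durations: observed spell length (e.g., in months) for each case.
    ended: True if the spell was observed to end, False if the case was
        censored (for example, by attrition or the end of the panel).
    weights: positive survey weight for each case (hypothetical values).
    Returns a list of (time, estimated survival probability) pairs.
    """
    event_times = sorted({t for t, e in zip(durations, ended) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Weighted number of cases still under observation at time t.
        at_risk = sum(w for d, w in zip(durations, weights) if d >= t)
        # Weighted number of spells observed to end exactly at time t.
        events = sum(
            w for d, e, w in zip(durations, ended, weights) if e and d == t
        )
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical spells: the second case is censored at month 3.
curve = weighted_kaplan_meier(
    durations=[2, 3, 3, 5],
    ended=[True, False, True, True],
    weights=[1.0, 2.0, 1.0, 1.0],
)
```

With all weights equal to 1 the function reduces to the usual unweighted estimator. Like the unweighted version, it rests on the assumption just described, that cases lost to attrition have the same survival prospects as those who remain.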
Armed with this assumption, analysts proceed to make use of all sample cases up to the point of attntion. They typically ignore the survey weights because there is no appropriate set of weights available. An alternative is to restrict the analysis to cases with complete data. Then the analyst can readily adapt the Kaplan-Meier and proportional hazards procedures to in- corporate survey weights that adjust for nonresponse. However, this ap- proach fails to use all of the cases. Research is needed into alternative imputation and weighting strategies that make fuller use of the cases with 10These imputations replace previous cross-sectional imputations for item nonresponse; they are not used to supply data for waves that are missing in their entirety. 1 1Under the proposed design with 4-year panels introduced every other year, cross-sectional estimates with maximum sample size for "on" years can be based on the first year of a new panel and the third year of the previous panel; cross-sectional estimates with maximum sample size in "off" years can be based only on the second year of the most recent panel and the fourth year of the previous panel. 12Work is currently under way on this topic through a joint statistical agreement between the Census Bureau and the University of Michigan.
METHODOLOGICAL RESEARCH AND EVALUATION 213 incomplete data for survival analysis and enable analysts to use a suitable set of weights. Finally, looking to the implementation of CAPI and a new database management system for SIPP as part of the redesign, it is important for the Census Bureau to press forward with the research that the data processing staff have begun on ways to integrate cross-sectional and longitudinal pro- cessing. Specifically, research is needed to determine ways to improve imputation for both item and wave nonresponse on an ongoing basis, through timely use of information from pnor and subsequent waves. In our view, the goal for the future should be to replace the current wave-specific pro- cessing with a system that makes use of all available information for a stream of data for each sample case in a manner that supports cross-sec- tional and longitudinal estimation on a consistent basis. To support re- search and development work in this area, the database management system chosen for SIPP (as noted in Chapter 5) should permit ready implementa- tion of alternative imputation procedures. Population Undercoverage It is well known that household surveys rarely cover the population as well as the decennial census (see Citro and Cohen, 1985; Shapiro and KostaDuch, 1988~; SIPP is no exception. Thus, even after adjustment for nonresponse, the SIPP data for March 1984 covered only 85 percent of black men and 91- 93 percent of all other people when compared with census-based population estimates. By age, black men in the 20-39 age categories were generally the worst covered (see Table 3-12 in Chapter 3~. The Census Bureau uses ratio-estimation procedures to adjust SIPP sur- ~rey weights for population undercoverage. The weights are adjusted so that the population estimated from each survey agrees with the updated decen- nial census-based population estimates by age, sex, race, and Hispanic on- gin. 
SIPP weights are also adjusted to agree with the March Current Population Survey (CPS) weights by household type.13

13 The CPS also exhibits undercoverage. For example, in March 1984, the CPS covered only 84 percent of black men and 90-94 percent of all others. Coverage ratios for black men were even worse in March 1986 for both the CPS and SIPP (see Table 3-12).

However, these ratio adjustments do not correct all coverage errors. First, they do not correct for the undercount in the decennial census itself: although it is minimal in total (net undercount was estimated to be between 1 and 2 percent of the population in 1980 and 1990), it is substantial for some population groups. Thus, in 1980, an estimated 9-10 percent of black children under age ~ were missed, as were about 15 percent of middle-aged black men (see Fay, Passel, and Robinson, 1988:Tables 3.2, 3.3; Robinson,
1990).14 Second, the ratio adjustments do not correct for characteristics other than age, sex, and ethnic origin on which the undercovered population might be expected to differ from the covered population.15

The correlates of undercoverage (besides age, race, and sex) are not definitely established. However, analysis of the 1980 census postenumeration survey and of other survey, administrative records, and ethnographic data suggests that census undercount rates are higher for the following groups (see Citro and Cohen, 1985; Fein, 1989): household members other than the head, spouse, and children of the head; unmarried people; people living alone or in very large households; and people residing in central cities of large metropolitan areas. In addition, there is evidence that the rate of undercount increases as household income decreases.

Overall, these tentative findings suggest that minorities, unattached people, and low-income people are at much greater risk of not being covered in household surveys than other people and, hence, that undercoverage affects SIPP-based estimates of program eligibility and participation. However, quantifying the impact of undercoverage on estimates of welfare program costs and caseloads developed from SIPP is not straightforward. For example, increasing the number of low-income households through a coverage adjustment would presumably enlarge the eligible pool for such programs as Aid to Families with Dependent Children (AFDC). On the other hand, adding "missing men" to some of these households might reduce the size of the eligible pool, depending on their relationship to the AFDC unit, their employment status, and their contribution to the household's resources.
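The ratio-estimation adjustment described in this section can be sketched in a few lines. This is a minimal illustration, not the Census Bureau's actual procedure: the function name `ratio_adjust`, the cell labels, and all numbers are hypothetical. It shows that the adjusted weights reproduce the control totals cell by cell, while every case within a cell receives the same factor, which is why the adjustment cannot correct for differences between covered and missed people inside a cell.

```python
from collections import defaultdict


def ratio_adjust(cases, control_totals):
    """Scale case weights so weighted counts match control totals per cell.

    cases: list of (cell, weight) pairs; `cell` is a hypothetical label for
    a demographic group such as an age-sex-race category.
    control_totals: dict mapping each cell to an independent population total.
    """
    weighted_counts = defaultdict(float)
    for cell, weight in cases:
        weighted_counts[cell] += weight

    # One ratio per cell: control total divided by the survey estimate.
    factors = {cell: control_totals[cell] / total
               for cell, total in weighted_counts.items()}

    adjusted = [(cell, weight * factors[cell]) for cell, weight in cases]
    return adjusted, factors


# Illustrative numbers only: the survey estimate reaches 85 percent of the
# control total in cell "A" and 92 percent in cell "B".
cases = [("A", 400.0), ("A", 450.0), ("B", 920.0)]
controls = {"A": 1000.0, "B": 1000.0}

adjusted, factors = ratio_adjust(cases, controls)

# The coverage ratio for a cell is the reciprocal of its adjustment factor.
coverage = {cell: 1.0 / factor for cell, factor in factors.items()}
```

Because both cases in cell "A" are multiplied by the same factor, any way in which the missed members of that cell differ from the covered members (income, household type) is left uncorrected.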
Household surveys other than SIPP experience substantial undercoverage for some population groups, but undercoverage may be particularly important for SIPP with its two main goals of improving information on income and programs. We urge the Census Bureau to conduct research on population undercoverage in SIPP, including simulation studies to assess the sensitivity of SIPP estimates to alternative procedures for adjusting for undercoverage.16

14 Clogg, Massagli, and Eliason (1986) review the potential impact of census coverage errors on direct and indirect uses of the data, such as weighting adjustments for sample surveys and denominators for vital rates, and cite several examples of important effects.
15 Fay (1989) analyzed within-household undercoverage in the CPS relative to the decennial census, using a 1980 CPS-census match. His results are suggestive of ways in which weighting adjustments do not adequately compensate for household survey undercoverage. For example, he finds that about one-fourth of adult black men who are counted in the census but not in the CPS are household heads, whose households should be categorized as married-couple households in the CPS but instead are categorized as households headed by unmarried women.
16 For an example of such a simulation, see Cohen et al. (1991), which compared estimates of the AFDC-eligible population using an unadjusted March CPS and a version in which crude adjustments to the weights were made for undercoverage on such characteristics as household income and marital status, age, and race of household head.

The goal of such research should be to develop improved
techniques for obtaining higher population coverage in the field as well as improved weighting adjustment procedures.17

Evaluation of the Redesign

It is not too early to begin planning the priority research that should be conducted after implementation of the SIPP redesign, with the goal of assessing its successes and identifying problem areas for timely correction. We briefly discuss below some of the topics that we believe will require careful study: attrition, length of recall period, phase-in of CAPI, and the effectiveness of changed questionnaire content. We urge that the Census Bureau continue a program of small-scale forward record-check studies, in which sampled cases receive the same interviews as cases in the main survey, to help evaluate the effects on data quality of various features of the redesign. Full record checks that match SIPP sample cases, including reporters and nonreporters, with administrative records to assess net reporting error, including both underreports and overreports, would also be useful to conduct periodically.18

The proposed redesign entails a significant extension in the length of SIPP panels from 32 to 48 months. We do not expect that cumulative attrition will increase very much because the available evidence is that most attrition occurs in the first few waves of a panel. However, the evidence from longer panel surveys is not directly relevant to SIPP with its short intervals between waves. The attrition effects of the new design must therefore be carefully watched.

We urge the Census Bureau to plan a major assessment of the attrition from the redesigned panels. The Bureau should plan to monitor attrition rates on a continuing basis and to carry out timely studies of the characteristics of households and respondents who do and do not drop out of the sample. Such studies can make use of data that are available from earlier waves for households that drop out.
Special follow-up studies of nonrespondents at early waves may also be useful. In addition, analysis of information from administrative records for cases that drop out, using a forward record-check sample, could prove helpful in assessing the causes and consequences of attrition.

17 We note that some work in this area has, in fact, been started. See Cantor and Edwards (1992), who report on a small-scale test to determine the effects on coverage of household members in SIPP of the current and an alternative procedure to obtain household rosters.
18 We believe that the Census Bureau should be able to greatly reduce the delays that adversely affected the analysis of the SIPP record-check study (see Marquis and Moore, 1990b), as more experience is gained with such studies, a new database management system is implemented for SIPP, and greater coordination is achieved among the SIPP project staff (see Chapter 8).
Should attrition at early waves increase more than expected, the Census Bureau should have resources set aside to permit timely experimentation with ways to overcome the problem. Finally, as noted above, an important part of the research that is devoted to estimation issues will need to be the development of more effective weighting and imputation procedures to minimize the effects of differential attrition from the longer panels.

It is unlikely that sufficient information to determine the optimum recall period for SIPP will be obtained prior to the redesign; hence, the Census Bureau will need to continue research on this topic. Again, forward record-check samples can provide a vehicle for experimentation on recall effects. The Census Bureau will also need to plan a research program to assess the effects of the CAPI system that is implemented for SIPP versus paper-and-pencil techniques on the quality and comparability of the data across SIPP panels. Finally, the Census Bureau should plan studies of the quality of the data collected in response to new or modified questions in SIPP. Comparisons with other data sources should be helpful for this purpose, as should analysis of the information available in the forward record-check studies.

Quick-Response Capability

As we have noted, it will not be possible to anticipate every data quality problem or user need that may arise in SIPP after the redesign. It is important for the Census Bureau to have some reserve capability to conduct research and evaluation on new problems and concerns as they arise and to determine the best ways to respond. For example, evaluation of questionnaire changes introduced as part of the redesign may reveal unexpectedly serious problems that require timely experimentation with revised content, edits, or other procedures.
Or shifting policy interests may require more detailed coverage of a topic in SIPP, and the Bureau will need the capacity for timely field testing of new or alternative questions. Or it may be necessary to take steps to deal with unexpectedly high rates of attrition. Hence, it is only prudent for the Census Bureau to have contingency plans to permit timely assessment and corrective action in the event that a serious problem occurs with the redesign.

Recommendation

We believe that methodological research and evaluation is needed to inform and assess the SIPP redesign. The magnitude of the changes that are proposed for the redesign together with the relative newness of SIPP and its undeniable scope and complexity argue for a strong, multifaceted program.
Recommendation 7-1: The Census Bureau should support methodological research and evaluation for SIPP leading up to and following the survey redesign. The research program should include the following components:

· research to improve the format and wording of the questionnaire, making use of record-check studies and, to the extent possible, of findings from the current program of cognitively based questionnaire experimentation;
· research targeted to other aspects of the current redesign (and to possible design changes later on), including the length of the recall period, screening techniques to obtain larger sample sizes for subgroups of interest, and data collection modes (the best combination of computer-assisted personal and telephone interviewing and the possible role of centralized telephoning);
· research on issues of estimation and data use, taking into account the features of the redesign and including ways to improve cross-sectional and longitudinal weights, imputation procedures, and population coverage;
· research to evaluate the success of major elements of the redesign (e.g., the attrition effects of longer panels); and
· a quick-response capability to address unanticipated problems with the implementation of the redesign.

CONTINUOUS ERROR MONITORING

With the focus on the upcoming redesign of SIPP, it is important that the Census Bureau not lose sight of the need for continuous monitoring of error levels in SIPP. Users need a constant flow of information about data quality as they seek to work with later panels and topical modules and to compare their results with work on earlier panels and other data sources. In this regard, we wholeheartedly support the plan of the Housing and Household Economic Statistics Division to compare income data from the March 1991 CPS and the 1990 SIPP panel.
Many users will want to work with the 1990 SIPP panel because it offers the largest sample size of any SIPP panel yet fielded, and an in-depth comparison of SIPP and CPS data for 1990 will greatly benefit users. Continuous monitoring, which is a vital source of information for the SIPP staff at the Census Bureau as well as outside users, covers a range of topics and methods. For example, internal analysis of SIPP data generates rates of attrition and of person and item nonresponse. Internal analysis also generates estimates of the extent to which transitions are reported between pairs of months on and off the seam. Analysis of reinterviews of subsamples
of SIPP respondents may also provide useful information on questions that are not well understood or reported.

Comparison of SIPP aggregates with aggregates from other sources is another form of monitoring, which may reveal differences that raise warning flags for users and SIPP staff. To be meaningful, such comparisons require detailed understanding of both SIPP and the alternative source. For example, aggregates from the National Income and Product Accounts (NIPA) often serve as a basis of comparison for income totals from surveys, but extensive manipulation is usually required to achieve a valid comparison. Substantial revisions to the NIPA are under way or being planned, and it is important that Census Bureau analysts keep abreast of these changes to the NIPA and their implications for evaluation of income data from SIPP and the March CPS.

Record-check studies can also make a significant contribution to assessing error levels in SIPP. The small-scale forward record checks that we recommend as part of the research leading up to and following the redesign should be helpful in this regard. Also, it would be very useful to periodically conduct reverse or full record-check studies based on matching SIPP panels with administrative records (e.g., tax returns or program case records). Reverse record checks that are based on probability samples of the entire SIPP universe are more expensive and time consuming to carry out, but they are the only way to obtain a full assessment of reporting errors, including false positives and false negatives. We note that the Census Bureau now has under way a match of the 1990 SIPP panel with an extract of the Individual Master File of tax returns from the Internal Revenue Service (IRS). Although the extract file contains a limited set of items, it can help evaluate certain types of income reports.
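The distinction drawn above between false negatives and false positives can be made concrete with a small sketch. It assumes a full match in which every sample case has both a survey report and an administrative record, and it treats the record as the truth, which is itself an assumption because records can contain errors. The function name `reporting_errors` and the data are hypothetical.

```python
def reporting_errors(matched_cases):
    """Summarize participation reporting errors from a full record match.

    matched_cases: list of (survey_says_yes, record_says_yes) booleans, one
    pair per sample case. Assumes at least one true participant and at least
    one true nonparticipant, so the denominators are nonzero.
    """
    true_yes = sum(1 for _, record in matched_cases if record)
    true_no = len(matched_cases) - true_yes
    survey_yes = sum(1 for survey, _ in matched_cases if survey)
    false_neg = sum(1 for survey, record in matched_cases
                    if record and not survey)
    false_pos = sum(1 for survey, record in matched_cases
                    if survey and not record)
    return {
        # Share of true participants who failed to report participation.
        "underreport_rate": false_neg / true_yes,
        # Share of true nonparticipants who reported participation.
        "overreport_rate": false_pos / true_no,
        # Relative difference between the survey count and the record count.
        "net_error": (survey_yes - true_yes) / true_yes,
    }


# Illustrative data: 8 true participants, 2 of whom do not report;
# 12 true nonparticipants, 1 of whom reports participation.
cases = ([(True, True)] * 6 + [(False, True)] * 2
         + [(True, False)] * 1 + [(False, False)] * 11)
summary = reporting_errors(cases)
```

A sample drawn only from program records would contain no true nonparticipants, so the overreport rate could not be estimated from it; that is the sense in which only a match covering the whole survey universe yields both error types.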
We also encourage the Census Bureau to further exploit an existing resource: the reverse record-check study that involved matching records for a number of federal and state assistance programs in four states with SIPP cases in the first two waves of the 1984 panel. The analyses of the matched file conducted to date have been useful (see, e.g., Marquis and Moore, 1989, 1990a), but they did not go far enough. Very little marginal cost would be required to carry out additional analyses, such as investigating biases in duration of spells and reports of benefit amounts, and we urge that these be done at an early date. Such studies would benefit users and could also contribute useful information for consideration of possible improvements to the questionnaire.19

19 The staff who worked on the SIPP record-check study are now engaged with other projects. If Census Bureau staff are not available for further analysis of the data, the Bureau should consider inviting an outside researcher to carry out the work through a fellowship or other on-site arrangement that permits access to the data.
Another aspect of error monitoring relates to sampling error or variance. The Census Bureau needs to document for users the estimated sampling errors for estimates of subgroups from SIPP panels of different sizes and also consider innovative ways to reduce sampling error.20 The redesign of SIPP will entail important changes in the sampling scheme, namely, an increase in overall panel size and the introduction of oversampling of lower income groups. The Census Bureau will need to thoroughly investigate the variance effects of the new design on estimates developed for subgroups of the oversampled, undersampled, and total population. Indeed, we urge the Census Bureau to plan an in-depth technical report for users on the variance implications of the new sample design.

Looking to the future, we expect that the introduction of new data collection and processing technology (namely, CAPI and a new database management system) should improve the efficiency of the monitoring function by making it possible for analysts to have hands-on access to the data at an earlier stage in the processing. Such access should make it possible for many of the results from monitoring to have an immediate, beneficial impact on survey operations (e.g., leading to changes in edit or imputation routines). We urge the Census Bureau to make improved access to SIPP data on the part of its analysts an important goal of its investment in new technology.

Finally, a critical part of the monitoring function is the provision of information to all users. For this purpose, it is important that the Census Bureau support such means of communication as the SIPP Working Papers and, most especially, regular updates to the SIPP Quality Profile. As in the past, these documents should include error analyses that originate from both Census Bureau staff and outside users.
Recommendation 7-2: The Census Bureau should undertake continuous monitoring of error levels in present and future SIPP panels and regularly provide information on errors to users, in periodic updates of the SIPP Quality Profile and other publications.

20 One avenue for reducing variance that the Census Bureau is considering is to adjust the SIPP sample weights, using some type of raking or iterative proportional fitting procedure, so that weighted estimates match control totals from administrative records. Huggins and Fay (1988) conducted preliminary research on the feasibility of this approach to reduce the variance of income estimates from SIPP. In their study, the sample weights on a 12-month research file from the 1984 panel were adjusted to control totals derived from a 1 percent sample of individual income tax returns from the IRS. (Only the weights of SIPP cases that were successfully matched to a full IRS file were adjusted.) The results showed reduction of variance for such estimates as mean and median income.
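The raking (iterative proportional fitting) idea mentioned in the footnote above can be illustrated generically. The sketch below is not the procedure used in the study cited; the dimensions, category labels, and control totals are hypothetical, and the function names `rake` and `margin` are introduced only for illustration. Weights are rescaled one dimension at a time until the weighted margins agree with every set of control totals.

```python
def rake(weights, categories, targets, iterations=50):
    """Iterative proportional fitting of case weights to marginal totals.

    weights: starting weight for each case.
    categories: for each case, a tuple giving its category on each dimension.
    targets: one dict per dimension, mapping category -> control total.
    Assumes every category has positive weight and that the control totals
    for the different dimensions sum to the same grand total.
    """
    w = list(weights)
    for _ in range(iterations):
        for dim, target in enumerate(targets):
            # Current weighted total for each category on this dimension.
            totals = {}
            for wi, cats in zip(w, categories):
                totals[cats[dim]] = totals.get(cats[dim], 0.0) + wi
            # Rescale each case so this dimension's margins hit the targets.
            w = [wi * target[cats[dim]] / totals[cats[dim]]
                 for wi, cats in zip(w, categories)]
    return w


def margin(weights, categories, dim):
    """Weighted total per category on one dimension."""
    totals = {}
    for wi, cats in zip(weights, categories):
        totals[cats[dim]] = totals.get(cats[dim], 0.0) + wi
    return totals


# Hypothetical example: four cases classified by income group and filing type.
categories = [("low", "single"), ("low", "joint"),
              ("high", "single"), ("high", "joint")]
start = [100.0, 100.0, 100.0, 100.0]
targets = [{"low": 250.0, "high": 150.0},       # dimension 0 controls
           {"single": 180.0, "joint": 220.0}]   # dimension 1 controls

raked = rake(start, categories, targets)
income_margin = margin(raked, categories, 0)
filing_margin = margin(raked, categories, 1)
```

Unlike a single ratio adjustment, raking needs only the marginal control totals for each dimension, not totals for every cross-classified cell.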
COGNITIVE RESEARCH

Innovative data collection and processing technologies such as CAPI and a database management system can benefit SIPP (and other surveys) in many ways, including making possible substantial improvements in data quality. However, technology cannot substitute for appropriate understanding and motivation on the part of the survey respondents. To the extent that respondents do not understand the questions to mean what the survey designers intended or are not motivated to search their memories or consult records to the extent necessary to provide an accurate response, then the quality of the data will necessarily suffer.

A recent development in survey research has been the introduction of approaches from cognitive psychology to study in greater depth the ways in which respondents react to and interpret specific question wording. The results have often shown startling differences in perceptions between respondents and survey personnel (see Jabine et al., 1984). Federal statistical agencies, including the National Center for Health Statistics, the Bureau of Labor Statistics, and the Census Bureau, are now making considerable use of cognitive techniques for questionnaire research and experimentation, such as one-on-one sessions in which a researcher probes the respondent after each question to ask what he or she had in mind in answering.

Results

The Census Bureau recently began applying cognitive techniques to the task of improving SIPP measures of income and program participation (see Marquis, Moore, and Bogen, 1991). This work had its origins in the SIPP record-check study when analysis of the data did not produce clear findings for such phenomena as the seam problem or other response errors (although the analysis was limited).
Specifically, there was little support for the forgetting theory of memory: that is, underreports of participation for most programs were no more likely to occur for 4 months prior to the interview than for only 1 month. Exploratory observational research conducted in fall 1989 suggested that respondents often use simple heuristic devices, combined with a few recalled facts, to construct 4-month income streams instead of making the effort to develop a detailed recall or to check their records (e.g., they may derive monthly values from annual amounts). This research also suggested that misunderstandings about the intent of particular questions often occur because respondents do not understand the goals of particular sections of the questionnaire (Marquis, 1990).21

21 For this research, Census Bureau headquarters staff, who received special training in think-aloud and direct questioning techniques, accompanied SIPP interviewers, took notes on
Formal cognitive laboratory research conducted by Westat, Inc. (under contract to the Census Bureau) has begun to produce findings of relevance to understanding response errors in SIPP (see Cantor et al., 1991). The research involved recruiting 125 respondents, about one-half of whom were participants in some type of government program, to receive wave 1 and wave 2 SIPP interviews. At wave 1, half the respondents were administered the regular interview, and the other half were administered that interview and additional procedures. These procedures included asking the respondents to think aloud during the interview, repeat questions back in their own words, and answer additional detailed questions designed to obtain a second measure of recipiency and amounts. At wave 2, the additional procedures were used for both groups.

Cantor et al. (1991:4-15) report some very provocative findings from the wave 1 interview on respondents' motivation, information storage, comprehension, and information retrieval and formulation. On motivation, the questionnaire, as currently structured, does not encourage respondents to be active participants in the interview or to do their best to provide accurate information. For example, the questionnaire does not allow acceptance of information that respondents volunteer out of sequence, and it bores respondents with long lists of income sources of which only a few are usually relevant.

Errors in reporting program information often result from the fact that participants do not know the name of the program or know it by another name than that used in the questionnaire. For example, elderly respondents in the Westat study often could not differentiate between social security, Medicare, and "medical assistance," which is the local Medicaid name in Maryland and the District of Columbia.
In reporting earnings, many respondents think of net rather than gross pay, and they often have trouble aggregating weekly or biweekly paychecks to monthly units. Also, respondents often do not commit to memory the particular types of assets they hold, for example, whether they have a money market deposit account or a money market fund. Information on asset income amounts is even less frequently stored in memory.

21 (continued) the interviews, interrupted at various points to ask respondents how they interpreted and answered questions, and prepared written summaries of their impressions and experiences. The interview sessions were also taped.

Comprehension is clearly a problem: respondents find the labor force questions complex, with many qualifying phrases that make them hard to answer. Respondents often do not understand the distinction between earned or accrued and received income. (SIPP wants the latter amounts, including, for example, a current paycheck for work performed some time back, but
excluding, for example, interest on a certificate of deposit that has not yet matured.)

A common mistake is for respondents to think about the reference period up to the date of the interview, which leads to a number of problems with respect to anchoring time points and calculating total income. To recall earnings, respondents tend to use a simple heuristic strategy, such as retrieving paycheck amounts and calculating the number of paychecks in a month, or multiplying the number of hours worked by an hourly rate, or dividing annual salary by 12. Once they have calculated the amount for the most recent month, respondents tend to apply this figure to the other months. Many respondents in the Westat study made errors in reporting earnings (based on comparing responses to the SIPP questions with an additional detailed recall), most commonly because of misdating or miscounting paychecks.

Most respondents find it fairly easy to remember amounts for program participation, but very difficult to retrieve amounts for assets. They often guess at asset income, for example, calculating interest on the basis of the average balance in a savings account. Also, respondents find it too difficult and beyond their patience to develop accurate asset income amounts in response to the SIPP questions that require them to distinguish between separate and joint accounts, to aggregate across months, and to aggregate across asset types.

Preliminary findings from the wave 2 interviews confirm the findings from wave 1 about the kinds of errors typically made by respondents.22 The results show evidence of the seam problem, particularly for amounts of asset income and wages.
In many instances in which respondents report changes in amounts at the seam between the two interviews, it is because they make a fresh calculation at the wave 2 interview for the month prior to that interview and generalize the result to the other 3 months of the reference period.

Alternative Questionnaires and Interviewing Procedures

It must be kept in mind that the findings from both the exploratory observational research and the more formal cognitive procedures applied by Westat are based on limited samples. Nonetheless, there are strong indications that the current SIPP questionnaire may contribute to inaccurate responses for many income and program-related items.

22 Of the 125 respondents at wave 1, 76 completed a wave 2 interview (personal communication, David Cantor, Westat, Inc.). The high nonresponse rate was due primarily to respondents' not returning phone calls or to their moving to a new address. Also, some of the wave 1 respondents were temporary residents of a homeless shelter and were no longer there at wave 2.
Staff of the Census Bureau's Center for Survey Methods Research are currently experimenting with an alternative set of questionnaires and interviewing procedures (Marquis, Moore, and Bogen, 1991; Bureau of the Census, 1992b). In brief, the alternative being tested relies much less on respondent recall and much more heavily on the use of records. At the first interview, the goal of the core questionnaire, namely to obtain complete and accurate information on all sources of family income during the reference period, is explained and respondents are asked to bring out payment records. Respondents who receive but do not ordinarily save records are given a folder in which to keep them for future interviews. Using records whenever possible, respondents are asked to help the interviewer fill out a worksheet for each income source that provides the amount and date of each payment. They can provide payment information for their income sources in any order (e.g., bank account interest before wages), and they do not have to compute monthly amounts. In other words, the respondents have the initiative in this portion of the interview, although the interviewers are trained to press respondents to think hard and not adopt a simple heuristic (e.g., reporting all paychecks from their job as the same).

To help ensure that no sources of income are overlooked, the interviewer next refers to flashcards that show about 50 sources of income and asks a short set of questions that are designed to jog respondents' memories. Income sources are grouped in ways that seem likely to make sense to respondents: for example, asking about money from the military, including veterans' payments, military retirement, National Guard pay, and GI bill benefits; or asking about "surprises," such as an inheritance, lottery winnings, profits from gambling, insurance settlements, and work-related bonuses or awards.
At subsequent interviews, to help reduce response errors, dependent interviewing techniques (reminding respondents of prior wave responses) are used. However, they are introduced late in the interview, in order to assist but not unduly influence respondents' recall. Also, the reference periods for pairs of interviews partly overlap, and differences in reported income sources and amounts are reconciled.23 Finally, all of the respondents in a household are urged to answer the questions in a group setting, and (at least during the testing period) they are asked for permission to record the interview.

23 Specifically, the reference period for the first of two interviews extends up to the time of the interview, instead of, as now, stopping at the end of the month prior to the interview.

Preliminary results from initial small-scale field tests of these new methods are encouraging on some dimensions, although less so on others (Bureau of the Census, 1992b). On the positive side, much higher proportions of respondents used records than in regular SIPP interviews: for example, 65-80
percent of income sources were supported by at least one record in the tests, compared with 20 percent in the regular SIPP interviews. Also, there was essentially no seam bias in reports of program participation: the ratio of transitions on or off all programs reported between waves 1 and 2 to transitions reported within a wave was about 1, compared with ratios of 2 to 6 for various programs in the regular SIPP interviews. Finally, most respondents were willing to have the interviews recorded, and most adult household members participated in a group interview.

On the negative side, the wave 1 interviews took longer and hence were more costly to complete, as respondents sought out their records. Also, the response rates were unacceptably low (64 percent after two waves), although the reasons for low response do not appear related to the new procedures, as most refusals occurred before the procedures were introduced. Rather, the causes appear to include the traditional problems of infrequent and inefficient call-backs to follow up nonresponse (e.g., too many daytime and too few evening calls to households). However, there remains the concern that the new interviewing procedures, by placing new demands on the interviewers, will require field personnel who are better trained and more highly skilled than is the case with the current procedures. Thus, there is the potential for interviewer variance and bias as well as increased costs of recruiting, training, and retaining interviewers who are able to consistently and accurately apply the new procedures.

The SIPP cognitive research staff are currently planning a larger evaluation of the new procedures and questionnaire format in Milwaukee, Wisconsin, to include two waves administered in fall 1992-winter 1993.
A sample of 700 households will be obtained from administrative records for four programs and the records of a local employer, permitting evaluation of the quality of many of the survey items through a forward record-check approach. Half the sample will be interviewed using the regular SIPP format and half using the new format. The staff have also proposed adding a third wave, conducted by telephone, and a study to assess the ability to program the new questionnaire format into CAPI.

Looking to the Future

We are impressed with the effort that the Census Bureau has made to use cognitive research methods to understand and seek to improve the quality of SIPP responses. The promise for a very different way of relating to respondents that obtains high-quality responses seems strong. We were not in a position to comment on specific details of the current program of field testing and experimentation with alternative questionnaires and interviewing procedures because the program was in the very early stages at the time of our deliberations. We urge the Census Bureau to seek continued review
and guidance on this work from experts in the field (e.g., members of the ASA/SRM Working Group on Technical Aspects of SIPP).

Our major concern is how the research on alternative questionnaire design and interviewing fits in with the other planned improvements in SIPP data collection and processing technology and with the overall goal of implementing major changes in the 1995 or 1996 panel. Should the technique of free recall of income sources, using worksheets to record payment streams, prove effective in the field, its use will have major implications for SIPP data processing. It seems possible that the equivalent of worksheets could be built into a CAPI system and that the necessary computations to produce monthly incomes from the individual payment records could be made within a database management system. However, an extensive amount of reprogramming would be required, given that work is already in progress to develop CAPI and database management systems for SIPP that assume that a questionnaire close to the current fixed-format document will be in use.24 It seems unlikely to us that such reprogramming could be accomplished and fully tested in time for the redesign.

In addition to the question of integration with the CAPI and database management system development, we are concerned about the time that will be required for rigorous evaluation of the new procedures, which differ greatly from current survey practice. We believe that the Census Bureau should have results from more than one full-scale test (as is planned for Milwaukee) in order to develop sufficiently reliable assessments of the cost, feasibility, and data quality implications before making these kinds of changes.
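As a rough illustration of the computation described above, in which monthly incomes are produced from individual payment records, the following Python sketch sums worksheet-style payment entries by month and income source. The record layout, the sample values, and the function name monthly_income are hypothetical and are not drawn from any Census Bureau system; an actual implementation within a database management system would likely differ.

```python
from collections import defaultdict
from datetime import date

# Hypothetical worksheet entries: (income source, payment date, amount in dollars).
# These values are invented for illustration only.
payments = [
    ("wages", date(1992, 10, 2), 410.00),
    ("wages", date(1992, 10, 16), 410.00),
    ("wages", date(1992, 10, 30), 410.00),
    ("food stamps", date(1992, 10, 5), 120.00),
    ("wages", date(1992, 11, 13), 410.00),
]


def monthly_income(records):
    """Sum individual payments into totals keyed by (year, month, source)."""
    totals = defaultdict(float)
    for source, paid_on, amount in records:
        totals[(paid_on.year, paid_on.month, source)] += amount
    return dict(totals)


totals = monthly_income(payments)
for key in sorted(totals):
    print(key, totals[key])
```

Keying the totals by year, month, and source keeps each payment stream separate, so that a month with an unusual number of paydays remains visible in the monthly figures rather than being averaged away.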
Overall, it seems very ambitious to attempt to carry out the necessary testing and evaluation of the new procedures and achieve a seamless integration with the CAPI and database management system development work prior to the SIPP redesign. But if the cognitive questionnaire research is treated entirely as an experimental program, the likely result is that it will be starved for resources and that any positive findings will have little impact on SIPP until at least the next major redesign in the year 2005.

There is no easy way out of this dilemma. We urge the Census Bureau to consider an approach whereby a team of systems and research staff work on a prototype of an integrated system that includes the CAPI and database management system programs that will be required for a new questionnaire. A firm schedule should be developed for design and testing of the prototype so that positive outcomes can lead to changes in SIPP without waiting until the next major redesign. (Also, as we urge above, the findings from the cognitive research program should be used to the extent possible to improve the standard questionnaire.) Furthermore, we encourage the Census Bureau to devote adequate resources to the field test program so that results can be obtained on a timely basis from reasonably large samples of respondents in more than one site. Finally, as planned for the upcoming evaluation study in Milwaukee, all tests should build in the means to evaluate fully the effects of the new procedures on costs and data quality.

Recommendation 7-3: We strongly support the Census Bureau's program of cognitively based research and experimentation with the SIPP questionnaire, which could contribute to questionnaire improvements for the current redesign and perhaps, in the future, to a major revision of the questionnaire and interviewing procedures. The Bureau should subject the cognitive work to rigorous evaluation, including record-check studies to evaluate data quality.

24The SIPP CAPI staff, in discussions with the study panel, indicated that they are making changes to the current questionnaire that reflect some of the cognitive research findings (e.g., reordering and clarifying questions and simplifying skip patterns). However, they are not currently planning for any type of free format.