5 DATA COLLECTION AND PROCESSING
Pages 131-157

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 131...
... amount to 41 percent of the total. Data processing costs, of which about two-thirds are ...
1 Our evaluation of SIPP operations and plans for future improvements is greatly indebted to the work of panel members Martin David and Randall Olsen and to Carol Sheets, head of the data processing staff for the National Longitudinal Surveys of Labor Market Experience (NLS)
From page 132...
... Training; Data processing; Regional office data entry (keying); Regional office clerical operations; Other regional office activities; Data processing (headquarters)
From page 133...
... It appears likely that the structure of the questionnaire contributes to such data quality problems in SIPP as underreporting of asset income and confusion among program names. Also, paper-and-pencil techniques with such a long, involved questionnaire inevitably lead to inefficiencies and introduce opportunities for interviewer as well as respondent errors (e.g., transcribing errors and mistakes in following the skip patterns)
From page 134...
... However, there was some evidence, particularly for blacks, that maximum telephone interviewing produced lower estimates of the poverty rate and other measures related to low income and receipt of means-tested program benefits. Also, the experiments covered only two successive waves, so no information is available on mode differences over a longer period.
Regional Office Operations
The Census Bureau's 12 regional offices play an important role in processing SIPP data. Clerks check the completed questionnaires mailed in by the SIPP interviewers for errors and omissions and assign geographic codes for sample people who moved.
From page 135...
... ;
· performing extensive consistency edits within and between sections of the questionnaire, between the control card and the questionnaire, and among responses for people in the same family and household;
· performing extensive sets of edits and imputations on each section of the questionnaire, including topical modules, to ensure that responses appear when they should and to impute missing values;4
· developing recodes based on combinations of data items to add to the data records;
· checking the accuracy of geographic codes;
· imputing an estimated household size for households that moved and could not be located, to use in the calculation of weights for movers;
· calculating cross-sectional weights for each month in the wave; and
· reformatting records and altering some data items to protect confidentiality as input to microdata files that are suitable for public release.
Later, after all waves of a panel are processed, the data for selected items are further edited for consistency over time, longitudinal weights are developed, and a public-use longitudinal file is constructed.
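The edit and imputation steps listed above can be illustrated with a minimal Python sketch. The field names (`receives_benefit`, `benefit_amount`), the single consistency rule, and the use of a donor mean are all hypothetical choices made for illustration; the excerpt does not describe the Census Bureau's actual edit rules or imputation method.

```python
from statistics import mean

def edit_and_impute(records):
    """Apply one consistency edit and one simple imputation to a list of dicts.

    Hypothetical rule set, not the SIPP production edits.
    """
    # Donor pool: amounts actually reported by people who receive the benefit.
    donors = [r["benefit_amount"] for r in records
              if r["receives_benefit"] and r["benefit_amount"] is not None]
    fill = mean(donors) if donors else None

    cleaned = []
    for r in records:
        out = dict(r, imputed=False)
        if not out["receives_benefit"]:
            # Consistency edit: a non-recipient should not carry an amount.
            out["benefit_amount"] = None
        elif out["benefit_amount"] is None and fill is not None:
            # Imputation: fill a missing amount and flag it as imputed.
            out["benefit_amount"] = fill
            out["imputed"] = True
        cleaned.append(out)
    return cleaned
```

Flagging each imputed value, as the `imputed` field does here, keeps reported and imputed responses distinguishable in later analysis.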
From page 136...
... The introduction of a new panel each year added greatly to the strain on the data processing staff, particularly given the need to rewrite large sections of computer code to keep up with changes to the questionnaire and to other aspects of the survey, changes that were inevitable for a new, complex data collection program. As a result, delivery schedules deteriorated.
5 Recalls were necessitated not only because of errors, but also because of design flaws.
From page 137...
... To enable the data processing to catch up, the Census Bureau decided in late 1987 to freeze the core questionnaire, permitting only changes that appeared absolutely essential to meet the survey's goals of providing improved data on income and program participation.6 The agency also strove to minimize changes in the fixed topical modules. This strategy was successful in that the Census Bureau began to meet its delivery targets of release of public-use files within a year of collection.
From page 138...
... We review the Bureau's plans for CAPI and database management technology for SIPP below.7 We also consider investment needs for continuing education of the SIPP data processing staff and issues involved in the transition to the new technology, together with the new survey design for SIPP.
COMPUTER-ASSISTED INTERVIEWING
There is currently considerable interest in the use of various methods of computer-assisted survey information collection (CASIC)
From page 139...
... At the present time, CATI, which is the oldest CASIC technique in use, is widely employed by government, academic, and private survey organizations in the United States and abroad. It is estimated that there are more than 1,000 CATI installations throughout the world (Subcommittee on Computer Assisted Survey Information Collection, 1990:11).
From page 140...
... Both techniques are needed because the CPS uses maximum personal interviewing for the first month in which an address is in the sample and maximum telephone interviewing for the remaining interviews. The Census Bureau is also planning to convert SIPP data collection to CAPI methods by February 1995.
From page 141...
... Successful implementation of CAPI for SIPP should produce significant improvements in timeliness of data processing and analysis. If there is no imputation, weighting, or special coding to be done (i.e., industry and occupation)
From page 142...
... A CAPI system should make it possible to treat SIPP as a single questionnaire in which the control card is integrated with the core questions for each respondent, and the topical modules are treated as conventional skip patterns. Having a single questionnaire for SIPP coupled with a sound database structure that is driven by the questionnaire has major implications for improved quality and timeliness of SIPP operations.
From page 143...
... We consider the problems involved in the transition to CAPI in the last section of this chapter. There would also be home office data processing costs to maintain and update the CAPI questionnaire and monitor transmission of data from 400 interviewers rather than from 12 regional offices.
From page 144...
... ;
· the ability to display data collected earlier in the same interview in a summary form to help the interviewer (e.g., to show a history of on-the-job, vacation, and other labor force statuses);
· the ability to incorporate responses from prior interviews into the current data collection and integrate these responses in a way that exploits the potential of dependent interviewing techniques;
· the ability to backtrack to any previous question and then return to the current question in a manner that preserves the integrity of the skip patterns;
· the ability to interrupt an interview and resume it later with no loss of data;
· support for multilingual interviewing;
· the ability to make changes to the questionnaire (modification of skip patterns, range restrictions, question wording, list of alternatives for close-ended field coding of verbatim responses)
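Two of the capabilities listed above, backtracking without breaking the skip patterns and interrupting and resuming an interview, can be sketched in Python as follows. The three-question instrument, the `Session` class, and the JSON save format are hypothetical and are not drawn from any CAPI system discussed in the chapter. The idea is to derive the route through the questionnaire from the stored answers each time, so that a changed answer cannot leave behind responses to questions that are no longer on the route.

```python
import json

# Hypothetical instrument: each question id maps to a function that picks
# the next question id from the answer (None ends the interview).
QUESTIONS = {
    "worked": lambda a: "hours" if a == "yes" else "looking",
    "hours": lambda a: None,
    "looking": lambda a: None,
}
FIRST = "worked"

class Session:
    def __init__(self, answers=None):
        self.answers = dict(answers or {})

    def path(self):
        """Replay the skip rules over stored answers; return (asked, current)."""
        asked, q = [], FIRST
        while q is not None and q in self.answers:
            asked.append(q)
            q = QUESTIONS[q](self.answers[q])
        return asked, q  # q is the next unanswered question, or None if done

    def answer(self, qid, value):
        """Record or change an answer, then drop answers now off the route."""
        self.answers[qid] = value
        asked, _ = self.path()
        self.answers = {k: v for k, v in self.answers.items() if k in asked}

    def save(self):
        """Serialize the interview state so it can be resumed later."""
        return json.dumps(self.answers)

    @classmethod
    def resume(cls, blob):
        return cls(json.loads(blob))
```

Re-answering an earlier question with the same value keeps all later answers, and `path()` returns the interviewer to the first unanswered question; re-answering with a different value discards only the answers that the new route no longer reaches.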
From page 145...
... The designer must ferret out such skip errors as questions that cannot be reached, groups of questions that form cycles, and logical errors in the skip patterns. The absence of extensive diagnostics places a heavy burden on the designer and increases the need for extensive testing.
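The kinds of diagnostics whose absence is noted here can be sketched in a few lines of Python. The representation is an assumption made for illustration: the questionnaire is a dictionary mapping each question id to the ids it can route to, and the function name `skip_diagnostics` is hypothetical. The sketch reports questions that cannot be reached from the first question and questions that lie on a cycle found by a depth-first search; it detects whether any cycle exists but may not list every member of overlapping cycles.

```python
def skip_diagnostics(skips, start):
    """skips maps each question id to the list of ids it can route to."""
    # Reachability: depth-first walk from the first question.
    reachable, stack = set(), [start]
    while stack:
        q = stack.pop()
        if q in reachable:
            continue
        reachable.add(q)
        stack.extend(skips.get(q, []))
    unreachable = sorted(set(skips) - reachable)

    # Cycle detection: three-color depth-first search. A route back to a
    # question still on the current trail (GREY) closes a cycle.
    WHITE, GREY, BLACK = 0, 1, 2
    color = {q: WHITE for q in skips}
    in_cycle = set()

    def visit(q, trail):
        color[q] = GREY
        trail.append(q)
        for nxt in skips.get(q, []):
            # Targets with no entry of their own (e.g. an end marker)
            # are treated as already finished.
            state = color.get(nxt, BLACK)
            if state == GREY:
                in_cycle.update(trail[trail.index(nxt):])
            elif state == WHITE:
                visit(nxt, trail)
        trail.pop()
        color[q] = BLACK

    for q in skips:
        if color[q] == WHITE:
            visit(q, [])
    return unreachable, sorted(in_cycle)
```

Checks of this kind catch structural errors only; logical errors in the routing conditions themselves still require testing against realistic interview scenarios.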
From page 146...
... The alternative is a database management system (see next section) that accommodates varying numbers of logical records for each interview and provides the flexibility so that data processing constraints do not again force compromises in data content.
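As a rough illustration of "varying numbers of logical records for each interview," the following Python sketch uses the standard-library `sqlite3` module. The table and column names are hypothetical, and SQLite stands in only as a convenient relational engine; the excerpt does not specify a schema or a product. Each person in an interview may have zero, one, or many income-source rows, so no fixed-length record layout is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE interview (interview_id INTEGER PRIMARY KEY, wave INTEGER);
    CREATE TABLE person (
        interview_id INTEGER REFERENCES interview(interview_id),
        person_no INTEGER,
        PRIMARY KEY (interview_id, person_no)
    );
    CREATE TABLE income_source (
        interview_id INTEGER,
        person_no INTEGER,
        source TEXT,
        amount REAL,
        FOREIGN KEY (interview_id, person_no)
            REFERENCES person(interview_id, person_no)
    );
""")
conn.execute("INSERT INTO interview VALUES (1, 1)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, 1), (1, 2), (1, 3)])
conn.executemany(
    "INSERT INTO income_source VALUES (?, ?, ?, ?)",
    [(1, 1, "wages", 1200.0), (1, 1, "interest", 15.0), (1, 2, "wages", 800.0)],
)

# Count and total the income sources per person; person 3 has none.
rows = conn.execute("""
    SELECT p.person_no, COUNT(s.source), COALESCE(SUM(s.amount), 0)
    FROM person p
    LEFT JOIN income_source s
      ON s.interview_id = p.interview_id AND s.person_no = p.person_no
    GROUP BY p.person_no
    ORDER BY p.person_no
""").fetchall()
```

Adding a new income source or a new topical item in this design means adding rows or a table rather than redefining a fixed record layout.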
From page 147...
... does not appear to meet the data collection requirements for SIPP, the Census Bureau should give high priority to investigating other available CAPI systems and determine the most appropriate system for SIPP.
DATABASE MANAGEMENT
The Census Bureau is currently in the process of updating its computing equipment, including replacing UNISYS mainframe, batch-oriented processors with networked VAX computers that facilitate interactive processing and the use of modern database management technology.
From page 148...
... This simple but powerful structure is key to many of the advantages of relational database management technology, including its query and processing capabilities. However, for performance and other practical reasons, no current relational database management system (RDBMS)
From page 149...
... Instead, it should be possible for the Bureau to develop more extensive yet timely longitudinal imputations by using data from surrounding waves, a goal that the SIPP data processing staff have indicated is a high priority.17
16 Presumably, most of the editing will be performed within the CAPI system, but some additional editing may be required within the DBMS. Careful coordination of the CAPI and database management systems is also needed to achieve flexibility with regard to the questionnaire content.
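The idea of imputing from surrounding waves can be sketched as follows. This is a deliberately simple Python illustration, assuming one numeric item observed once per wave, with `None` marking a missing report; the function name and the averaging rule are hypothetical and are not the method the Bureau uses or plans to use.

```python
def impute_from_neighbors(values):
    """Fill None entries from the nearest reported values before and after.

    Returns (filled, flags), where flags[i] is True if entry i was imputed.
    Only originally reported values are used as donors.
    """
    filled, flags = list(values), [False] * len(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        before = next((values[j] for j in range(i - 1, -1, -1)
                       if values[j] is not None), None)
        after = next((values[j] for j in range(i + 1, len(values))
                      if values[j] is not None), None)
        if before is not None and after is not None:
            filled[i] = (before + after) / 2   # both neighbors: average
        elif before is not None or after is not None:
            filled[i] = before if before is not None else after
        else:
            continue                            # nothing to borrow from
        flags[i] = True
    return filled, flags
```

An imputation of this kind needs the later wave to be available, which is why it depends on a database that holds all waves of a panel together rather than on wave-by-wave processing.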
From page 150...
... Fifth, the database management system developed for SIPP should provide a ready interface to such statistical packages as SAS and SPSS. Such interfaces will facilitate internal analysis of the data by Census Bureau staff both for evaluation purposes (e.g., analyzing the effect of imputations on ...
... lion: for example, wave 7 of the 1991 panel and wave 4 of the 1992 panel will be fielded at the same time.
From page 151...
... Finally, the database management system that is used to construct the SIPP database should also support construction of a complete corresponding database of the documentation. At present, there is no documentation database for SIPP that can be related to the data, which contributes to problems in releasing fully documented analysis files on a timely basis and hinders users in obtaining a complete understanding of the file structures and data content.
From page 152...
... INVESTING IN THE DATA PROCESSING STAFF
The Census Bureau has a distinguished history of making seminal contributions to data collection and processing technology. However, in recent years, the Bureau has lagged behind best practice and has lacked the hardware and software with which to implement state-of-the-art methods of data collection and processing.
From page 153...
... Recommendation 5-3: In view of the major advances that continue to occur in computing hardware and software, the Census Bureau should devote significant resources to continued education and training of its data processing staff. In particular, the SIPP processing staff should take advantage of the experience of data processing facilities outside the Census Bureau that deal with longitudinal surveys.
From page 154...
... The new technologies must not only be fully developed and tested in their own right, but also mesh with other changes to SIPP, such as questionnaire content and format changes. The length and complexity of the SIPP questionnaire, which entail the need for complex editing and data processing procedures, and the frequency of interviews will pose substantial challenges to the smooth implementation of CAPI/database management technology.
From page 155...
... If additional time for development and operational testing would permit a smoother transition, then we believe that time would be well spent. We suggest that the Census Bureau consider the following schedule (see Table 5-2): field a somewhat smaller panel of, say, 15,000 households in 1995 that uses CAPI and database management technology for data collection and processing.
From page 156...
... However, a sudden switch could pose other kinds of problems: for example, the regional offices would have to cope with an abrupt reduction in workload. Also, the data processing staff would have to make special efforts to quickly move the data from the most recent waves of existing panels so that the next wave could be CAPI and also prepare CAPI versions ...
20 To the extent possible, the regular SIPP data products should be provided to users from the 1995 panel, although, if problems arise, it may be necessary to release data products as "research" files or reports to be used with caution.
From page 157...
... Beginning in 1996, the Census Bureau would have only the CAPI/database management system to run.21 Recommendation 5-4: The Census Bureau should make every effort to ensure smooth implementation of CAPI and an improved database management system for SIPP under the new design of 4-year panels introduced every 2 years. One option that the Census Bureau should consider is to field a large-scale dress rehearsal panel in 1995 as a way to work out any operational problems.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.