For the 2000 census, questionnaire data were captured by scanning short-form and long-form returns into computer files and using optical mark recognition and optical character recognition (OMR/OCR) to record the information. Clerks keyed data items from images when the automated technology could not read the responses. Long-form-sample information was keyed in a second, separate process so that data capture for the basic items on all returns could be completed as quickly as possible.
After data capture, long-form-sample records for households and their members could fall into one of two categories:
Long-Form Data-Defined: At least one member of a household in the long-form sample was “long-form data-defined”; that is, at least one member had at least two long-form data items reported. All records for long-form data-defined households were retained in the sample. Any long-form housing or person items that were not reported, or were reported inconsistently, had values supplied through item imputation, assignment, and editing. Imputations for missing complete-count items that had been performed during the basic data processing were retained (i.e., they were not reimputed during the long-form-sample processing).
Whole-Household Nonresponse: Households that lacked any long-form data-defined persons were dropped from the sample. Weights were developed for long-form data-defined households and their members so that long-form-sample estimates agreed with complete-count totals on basic items. The weighting effectively adjusted for whole-household nonresponse.
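The two categories above can be illustrated with a minimal sketch. It assumes a hypothetical record layout in which each person is a dictionary of long-form items with None for blanks, and the function names are illustrative only. The weighting step is deliberately reduced to a single ratio adjustment within one hypothetical cell: it shows how matching a complete-count total compensates for dropped households, and it is not the Census Bureau's actual weighting procedure.

```python
# Threshold stated in the text: a person is long-form data-defined
# when at least two long-form items are reported.
MIN_REPORTED_ITEMS = 2


def person_is_data_defined(long_form_items):
    """long_form_items: hypothetical dict of item name -> value (None if blank)."""
    reported = sum(1 for value in long_form_items.values() if value is not None)
    return reported >= MIN_REPORTED_ITEMS


def household_is_data_defined(members):
    """A household is retained if at least one member is data-defined."""
    return any(person_is_data_defined(person) for person in members)


def ratio_adjusted_weights(households, complete_count_total):
    """Assign one equal weight to each retained household in a single cell.

    households: dict of household id -> list of person item dicts.
    complete_count_total: number of households in the cell according to
    the complete-count records (an assumed input for this illustration).
    Simplification: weight = complete-count total / retained households.
    """
    retained = [
        household_id
        for household_id, members in households.items()
        if household_is_data_defined(members)
    ]
    if not retained:
        raise ValueError("no long-form data-defined households in this cell")
    weight = complete_count_total / len(retained)
    return {household_id: weight for household_id in retained}


# Hypothetical cell: three sampled households, 18 households in the complete count.
sample = {
    "A": [{"income": 41000, "occupation": "teacher", "commute": None}],
    "B": [{"income": None, "occupation": None, "commute": None},
          {"income": 30000, "occupation": "nurse", "commute": 25}],
    "C": [{"income": None, "occupation": "clerk", "commute": None}],  # only one item
}
weights = ratio_adjusted_weights(sample, complete_count_total=18)
```

In this example household C is treated as whole-household nonresponse and dropped. The two retained households each receive a weight of 9.0, so the weighted household count still equals the complete-count total of 18.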
In a procedure similar to that used in 1990, 2000 long-form-sample weights were developed to produce estimates for specified groups and geographic areas that agreed with estimates from the basic (complete-count) data records. A goal of the weighting was to