Detailed Findings and Recommendations
FOR THE READER’S CONVENIENCE, we list below the 19 specific findings for 2000 and the 16 recommendations for 2010 that appear in Chapters 3–4 and 6–9 of this report. See Section 1-D for the panel’s general, overall findings.
OVERALL CENSUS DESIGN
Finding 3.1: The lack of agreement until 1999 on the basic census design among the Census Bureau, the administration, and Congress hampered planning for the 2000 census, increased the costs of the census, and increased the risk that the census could have been seriously flawed in one or more respects.
Recommendation 3.1: The Census Bureau, the administration, and Congress should agree on the basic census design for 2010 no later than 2006 in order to permit an appropriate, well-planned dress rehearsal in 2008.
Recommendation 3.2: The Census Bureau, the administration, and Congress should agree on the overall scheme for the 2010 census and the new American Community Survey (ACS) by 2006 and
preferably earlier. Further delay will undercut the ability of the ACS to provide, by 2010, small-area data of the type traditionally collected on the census long form and will jeopardize 2010 planning, which currently assumes a short-form-only census.
ASSESSMENT OF 2000 CENSUS OPERATIONS
Finding 4.1: The use of a redesigned questionnaire and mailing strategy and, to a more limited extent, of expanded advertising and outreach—major innovations in the 2000 census—contributed to the success achieved by the Census Bureau in stemming the decline in mail response rates observed in the two previous censuses. This success helped reduce the costs and time of follow-up activities.
Recommendation 4.1: The Census Bureau must proceed quickly to work with vendors to determine cost-effective, timely ways to mail a second questionnaire to nonresponding households in the 2010 census, so as to improve mail response rates while minimizing duplicate enumerations.
Finding 4.2: Contracting for selected data operations, using improved technology for capturing the data on the questionnaires, and aggressively recruiting enumerators and implementing nonresponse follow-up were significant innovations in the 2000 census that contributed to the timely execution of the census.
Finding 4.3: The greater reliance on imputation routines to supply values for missing and inconsistent responses in 2000, in contrast to the greater reliance on telephone and field follow-up of nonrespondents in 1990, contributed to the timely completion of the 2000 census and to containing the costs of follow-up. It is not known whether the distributions of characteristics and the relationships among characteristics that resulted from imputation (particularly of long-form content) were less accurate than the distributions and relationships that would have resulted from additional follow-up.
Recommendation 4.2: Because the 2000 census experienced high rates of whole-household nonresponse and missing responses for individual long-form items, the Census Bureau’s planning for the 2010 census and the American Community Survey should include research on the trade-offs in costs and accuracy between imputation and additional field work for missing data. Such research should examine the usefulness of following up a sample of households with missing data to obtain information with which to improve the accuracy of imputation routines.
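The imputation-versus-follow-up trade-off in Recommendation 4.2 centers on routines such as sequential hot-deck imputation, in which a missing item is filled from the most recently processed complete record ("donor") with similar characteristics. The sketch below illustrates the general technique only; the field names, donor cells, and processing order are illustrative assumptions, not the Bureau's actual specification.

```python
# Minimal sequential hot-deck imputation sketch (illustrative, not the
# Census Bureau's production routine).  Records are processed in order;
# a missing item is filled from the most recent complete record in the
# same "imputation cell" (here, a coarse age-band x tenure cell).

def age_group(age):
    """Coarse age band used to define donor cells (illustrative)."""
    return min(age // 20, 3)

def hot_deck_impute(records, item="income"):
    """Fill missing `item` values from the last donor in the same cell."""
    last_donor = {}  # cell -> most recently reported value
    out = []
    for rec in records:
        rec = dict(rec)
        cell = (age_group(rec["age"]), rec["tenure"])
        if rec[item] is None:
            # Impute from the most recent donor in this cell, if any.
            rec[item] = last_donor.get(cell)
            rec["imputed"] = True
        else:
            last_donor[cell] = rec[item]
            rec["imputed"] = False
        out.append(rec)
    return out

records = [
    {"age": 34, "tenure": "renter", "income": 41000},
    {"age": 37, "tenure": "renter", "income": None},   # same cell as above
    {"age": 71, "tenure": "owner",  "income": 28000},
]
result = hot_deck_impute(records)
```

Follow-up of a sample of such households, as the recommendation suggests, would supply observed values against which imputed ones could be compared and the donor-cell definitions refined.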
Finding 4.4: The use of multiple sources to build a Master Address File (MAF)—a major innovation in 2000—was appropriate in concept but not well executed. Problems included changes in schedules and operations, variability in the efforts to update the MAF among local areas, poor integration of the address list for households and group quarters, and difficulties in determining housing unit addresses in multiunit structures. Two major unplanned changes were made to MAF development: a determination late in the decade that a costly complete block canvass was needed to complete the MAF in mailout/mailback areas, and a determination as late as summer 2000 that an ad hoc operation was needed to weed out duplicate addresses. Problems with the MAF contributed to census enumeration errors, including a large number of duplicates.
Finding 4.5: The problems in developing the 2000 Master Address File underscore the need for a thorough evaluation of the contribution of various sources, such as the U.S. Postal Service Delivery Sequence File and the Local Update of Census Addresses Program, to accuracy of MAF addresses. However, overlapping operations and unplanned changes in operations were not well reflected in the coding of address sources on the MAF, making it difficult to evaluate the contribution of each source to the completeness and accuracy of the MAF.
Recommendation 4.3: Because a complete, accurate Master Address File is not only critical for the 2010 census, but also important for the 2008 dress rehearsal, the new American Community Survey, and other Census Bureau surveys, the Bureau must develop more effective procedures for updating and correcting the MAF
than were used in 2000. Improvements in at least three areas are essential:
The Census Bureau must develop procedures for obtaining accurate information to identify housing units within multiunit structures. It is not enough to have an accurate structure address.
To increase the benefit to the Census Bureau from its Local Update of Census Addresses (LUCA) and other partnership programs for development of the MAF and the TIGER geocoding system, the Bureau must redesign the program to benefit state and local governments that participate. In particular, the Bureau should devise ways to provide updated MAF files to participating governments for statistical uses and should consider funding a MAF/TIGER/LUCA coordinator position in each state government.
To support adequate assessment of the MAF for the 2010 census, the Census Bureau must plan evaluations well in advance so that the MAF records can be assigned appropriate address source codes and other useful variables for evaluation.
Finding 4.6: The enumeration of people in the 2000 census who resided in group quarters, such as prisons, nursing homes, college dormitories, group homes, and others, resulted in poor data quality for this growing population. In particular, missing data rates, especially for long-form items, were much higher for group quarters residents than for household members in 2000 and considerably higher than the missing data rates for group quarters residents in 1990 (see Finding 7.3). Problems and deficiencies in the enumeration that undoubtedly contributed to poor data quality included: the lack of well-defined concepts of types of living arrangements to count as group quarters; failure to integrate the development of the group quarters address list with the development of the Master Address File; failure to plan effectively for the use of administrative records in enumerating group quarters residents; errors in assigning group quarters to the correct geographic areas; and poorly controlled tracking and case management for group quarters. In addition, there was
no program to evaluate the completeness of population coverage in group quarters.
Recommendation 4.4: The Census Bureau must thoroughly evaluate and completely redesign the processes related to group quarters populations for the 2010 census, adapting the design as needed for different types of group quarters. This effort should include consideration of clearer definitions for group quarters, redesign of questionnaires and data content as appropriate, and improvement of the address listing, enumeration, and coverage evaluation processes for group quarters.
ASSESSMENT OF COVERAGE IN 2000
Finding 6.1: The 2000 Accuracy and Coverage Evaluation (A.C.E.) Program operations were conducted according to clearly specified and carefully controlled procedures and directed by a very able and experienced staff. In many respects, the A.C.E. was an improvement over the 1990 Post-Enumeration Survey, achieving such successes as high response rates to the P-sample survey, low missing data rates, improved quality of matching, low percentage of movers due to more timely interviewing, and substantial reductions in the sampling variance of coverage correction factors for the total population and important population groups. However, inaccurate reporting of household residence in the A.C.E. (which also occurred in the census itself) led to substantial underestimation of duplicate enumerations in 2000 in the original (March 2001) A.C.E. estimates.
Finding 6.2: The Census Bureau commendably dedicated resources to the A.C.E. Revision II effort, which completely reestimated net undercount (and overcount) rates for several hundred population groups (poststrata) by using data from the original A.C.E. and several evaluations. The work exhibited high levels of creativity and effort devoted to a complex problem. From innovative use of matching technology and other evaluations, it provided substantial additional information about the numbers and sources of erroneous census enumerations and, similarly, information with which to correct the residency status of the independent A.C.E. sample. It provided little
additional information, however, about the numbers and sources of census omissions.
Documentation for the original A.C.E. estimates (March 2001), the preliminary revised estimates (October 2001), and the A.C.E. Revision II estimates (March 2003) was timely, comprehensive, and thorough.
Finding 6.3: We support the Census Bureau’s decision not to use the March 2003 Revision II A.C.E. coverage measurement results to adjust the 2000 census base counts for the Bureau’s postcensal population estimates program. The Revision II results are too uncertain to be used with sufficient confidence about their reliability for adjustment of census counts for subnational geographic areas and population groups. Sources of uncertainty include:
the small samples of A.C.E. data available to correct components of the original A.C.E. estimates of erroneous enumerations and non-A.C.E. residents and to correct the original estimate of nonmatches, and the consequent inability to make these corrections for other than very large population groups;
the inability to determine which of each pair of duplicates detected in the A.C.E. evaluations was correct and which should not have been counted in the census or included as an A.C.E. resident;
the possible errors in subnational estimates from the choice of one of several alternative correlation bias adjustments to compensate for higher proportions of missing men relative to women;
the inability to make correlation bias adjustments for population groups other than blacks and nonblacks; and
the possible errors for some small areas from the use of different population groups (poststrata) for estimating erroneous census enumerations and census omissions.
In addition, there is a large discrepancy in coverage estimates for children ages 0–9 between demographic analysis and Revision II A.C.E. estimates (a 2.6 percent net undercount versus a 0.4 percent net overcount, respectively).
Finding 6.4: Demographic analysis helped identify possible coverage problems in the 2000 census and in the A.C.E. at the national level for a limited set of population groups. However, there are sufficient uncertainties in the revised estimates of net immigration (particularly the illegal component) and the revised assumption of completeness of birth registration after 1984, compounded by the
difficulties of classifying people by race, so that the revised demographic analysis estimates cannot and should not serve as the definitive standard of evaluation for the 2000 census or the A.C.E.
Finding 6.5: Because of significant differences in methodology for estimating net undercount in the 1990 Post-Enumeration Survey Program and the 2000 Accuracy and Coverage Evaluation Program (Revision II), it is difficult to compare net undercount estimates for the two censuses. Nevertheless, there is sufficient evidence (from comparing the 1990 PES and the original A.C.E.) to conclude that the national net undercount of the household population and net undercount rates for population groups were reduced in 2000 from 1990 and, more important, that differences in net undercount rates between historically less-well-counted groups (minorities, children, renters) and others were reduced as well. From smaller differences in net undercount rates among groups and from analysis of available information for states and large counties and places, it is reasonable to infer that differences in net undercount rates among geographic areas were also probably smaller in 2000 compared with 1990. Despite reduced differences in net undercount rates, some groups (e.g., black men and renters) continued to be undercounted in 2000.
Finding 6.6: Two factors that contributed to the estimated reductions in net undercount rates in 2000 from 1990 were the large numbers of whole-person imputations and duplicate census enumerations, many of which were not identified in the original (March 2001) A.C.E. estimates. Contributing to duplication were problems in developing the Master Address File and respondent confusion about or misinterpretation of census “usual residence” rules, which resulted in duplication of household members with two homes and people who were enumerated at home and in group quarters.
Recommendation 6.1: The Census Bureau and administration should request, and Congress should provide, funding for the development and implementation of an improved Accuracy and Coverage Evaluation Program for the 2010 census. Such a program is essential to identify census omissions and erroneous enumerations and to provide the basis for adjusting the census counts for coverage errors should that be warranted.
The A.C.E. survey in 2010 must be large enough to provide estimates of coverage errors that achieve the level of precision targeted for the original (March 2001) A.C.E. estimates for population groups and geographic areas. Areas for improvement that should be pursued include:
the estimation of components of gross census error (including types of erroneous enumerations and omissions), as well as net error;
the identification of duplicate enumerations in the E-sample and nonresidents in the P-sample by the use of new matching technology;
the inclusion of group quarters residents in the A.C.E. universe;
improved questionnaire content and interviewing procedures about place of residence;
methods to understand and evaluate the effects of census records that are excluded from the A.C.E. matching (IIs);
a simpler procedure for treating people who moved between Census Day and the A.C.E. interview;
the development of poststrata for estimation of net coverage errors, by using census results and statistical modeling as appropriate; and
the investigation of possible correlation bias adjustments for additional population groups.
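The A.C.E. measures net coverage error with a dual-system (capture-recapture) estimator: the E-sample yields the rate at which census records are correct enumerations, and the P-sample yields the rate at which true residents were matched to the census. The sketch below shows the core poststratum-level calculation in simplified form; the numbers are invented, and the Bureau's production estimator adds weighting, missing-data, and mover adjustments not shown here.

```python
def dual_system_estimate(census_count, ii_count, ce_rate, match_rate):
    """Simplified dual-system estimator for one poststratum.

    census_count : total census count C in the poststratum
    ii_count     : census records excluded from A.C.E. matching (IIs)
    ce_rate      : E-sample correct-enumeration rate (CE / E)
    match_rate   : P-sample match rate (M / P)

    This omits the weighting and missing-data adjustments used in the
    actual A.C.E.; it illustrates the estimator's structure only.
    """
    data_defined = census_count - ii_count
    n_hat = data_defined * ce_rate / match_rate   # estimated true population
    correction_factor = n_hat / census_count      # coverage correction factor
    net_undercount_rate = (n_hat - census_count) / n_hat
    return n_hat, correction_factor, net_undercount_rate

# Illustrative poststratum: 100,000 counted, 2,000 IIs,
# 95% correct enumerations, 92% of true residents matched.
n_hat, ccf, undercount = dual_system_estimate(100_000, 2_000, 0.95, 0.92)
```

The structure makes plain why the findings above stress duplicate detection and residence reporting: unidentified duplicates inflate the correct-enumeration rate, and erroneous P-sample residents inflate the match rate, both biasing the estimated net undercount downward.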
Recommendation 6.2: The Census Bureau should strengthen its program to improve demographic analysis estimates, in concert with other statistical agencies that use and provide data inputs to the postcensal population estimates. Work should focus especially on improving estimates of net immigration. Attention should also be paid to quantifying and reporting measures of uncertainty for the demographic estimates.
Recommendation 6.3: Congress should consider pushing back the deadline to provide block-level census data for legislative redistricting to allow more time for evaluation of the completeness of population coverage and quality of the basic demographic items before they are released.
ASSESSMENT OF BASIC AND LONG-FORM-SAMPLE DATA
Finding 7.1: Rates of missing data in 2000 were low at the national level for the basic demographic items asked of everyone (complete-count items)—age, sex, race, ethnicity, household relationship, and housing tenure. Missing data rates for these items ranged from 2 to 5 percent (including records for people with one or more missing items and people who were wholly imputed). Rates of inconsistent reporting for the basic items (as measured by comparing responses for census enumerations and matching households in the independent Accuracy and Coverage Evaluation survey) were also low. However, some population groups and geographic areas exhibited high rates of missing data and inconsistent reporting for one or more of the basic items. No assessments have yet been made of reporting errors for such items as age, nor of the effects of imputation on the distributions of basic characteristics or the relationships among them.
Finding 7.2: For the household population, missing data rates were at least moderately high (10 percent or more) for over one-half of the 2000 census long-form-sample items and very high (20 percent or more) for one-sixth of the long-form-sample items. Missing data rates also varied widely among population groups and geographic areas. By comparison with 1990, missing data rates were higher in 2000 for most long-form-sample items asked in both years and substantially higher—by 5 or more percentage points—for one-half of the items asked in both years. In addition, close to 10 percent of long-form-sample households in 2000 (similar to 1990) provided too little information for inclusion in the sample data file. When dropped households and individually missing data are considered together, the effective sample size that is available for analysis for some characteristics is 60 percent or less of the original long-form-sample size.
Many long-form-sample items had moderate to high rates of inconsistent reporting, as measured in a content reinterview survey. Few assessments have yet been made of systematic reporting errors for the long-form-sample items, although aggregate comparisons of employment data between the 2000 census and the Current Population Survey (CPS) found sizeable discrepancies in estimates of employed and unemployed people—much larger than the discrepancies found in similar comparisons for 1990. No analysis of the effects of item imputation and weighting on the distributions of characteristics or the relationships among them has yet been undertaken, although analysis determined that changes in imputation procedures contributed to the 50 percent higher unemployment rate estimate in the 2000 census compared with the April 2000 CPS.
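The effective-sample-size arithmetic in Finding 7.2 compounds two losses: whole households dropped from the sample file and item-level missing data among the households retained. A back-of-the-envelope sketch, with rates chosen for illustration within the ranges the finding reports:

```python
# How whole-household drops and item nonresponse compound.
# All rates below are illustrative, not official 2000 census figures.

original_sample = 1_000_000     # hypothetical long-form-sample persons
household_drop_rate = 0.10      # ~10% of households provided too little data
item_missing_rate = 0.30        # a very-high-missing-data item

retained = original_sample * (1 - household_drop_rate)
effective = retained * (1 - item_missing_rate)
effective_fraction = effective / original_sample
# 0.90 * 0.70 = 0.63 -> roughly 60% of the original sample remains
```

For items with higher missing-data rates, or for population groups with above-average household drop rates, the effective fraction falls still further, which is the basis for the "60 percent or less" figure in the finding.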
Recommendation 7.1: Given the high rates of imputation for many 2000 long-form-sample items, the Census Bureau should develop procedures to quantify and report the variability of the 2000 long-form estimates due to imputation, in addition to the variability due to sampling and to whole-household weighting adjustments. The Bureau should also study the effects of imputation on the distributions of characteristics and the relationships among them and conduct research on improved imputation methods for use in the American Community Survey (or the 2010 census if it includes a long-form sample).
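One standard way to quantify imputation variability of the kind Recommendation 7.1 asks for is multiple imputation: create several completed data sets, estimate on each, and combine the results with Rubin's rules. This is offered only as an example of the technique; the 2000 census used single imputation, and the inputs below are invented.

```python
from statistics import mean, variance

def rubin_combine(estimates, within_variances):
    """Combine m estimates from multiply imputed data sets (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)                 # combined point estimate
    u_bar = mean(within_variances)          # average within-imputation variance
    b = variance(estimates)                 # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b     # total variance, per Rubin's rules
    return q_bar, total_var

# Five hypothetical estimates of a mean, one per imputed data set,
# with their within-imputation variances.
q, t = rubin_combine([10.1, 9.8, 10.4, 10.0, 10.2],
                     [0.25, 0.24, 0.26, 0.25, 0.25])
```

The between-imputation term is precisely the extra variability that single-imputation estimates conceal: if all five imputed data sets agreed, the total variance would collapse to the ordinary sampling variance.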
Recommendation 7.2: The Census Bureau should make users aware of the high missing data rates and measures of inconsistent reporting for many long-form-sample items, and inform users of the 2000 census long-form-sample data products (Summary Files 3 and 4 and the Public Use Microdata Samples) about the need for caution in analyzing and interpreting those data.
Finding 7.3: For group quarters residents, missing data rates for most long-form-sample items were very high in 2000 (20 percent or more for four-fifths of the items and 40 percent or more for one-half of the items). The 2000 rates were much higher than missing data rates for household members and considerably higher than missing data rates for group quarters residents in 1990. The 2000 missing data rates were particularly high for prisoners, residents of nursing homes, and residents of long-term-care hospitals, perhaps because of heavy reliance on administrative records for enumerating them.
Few assessments have yet been made of systematic reporting errors for group quarters residents for long-form-sample items, nor of the effects of imputations on the distributions of characteristics or the relationships among them. However, a systematic error was found in the imputation of employment status for people living in noninstitutional group quarters because of a particular pattern of missing data. The result was a substantial overestimate of unemployment rates for these people, so much so that the Census Bureau reissued employment status tabulations for household members only, excluding group quarters residents.
Recommendation 7.3: The Census Bureau should publish distributions of characteristics and item imputation rates, for the 2010 census and the American Community Survey (when it includes group quarters residents), that distinguish household residents from the group quarters population (at least the institutionalized component). Such separation would make it easier for data users to compare census and ACS estimates with household surveys and would facilitate comparative assessments of data quality for these two populations by the Census Bureau and others.
RACE AND ETHNICITY MEASUREMENT
Finding 8.1: People who marked more than one race category in the 2000 census (the first to allow this reporting option) accounted for over 2 percent of the total population and as much as 8 percent of children ages 0 to 4, suggesting that the multirace population will grow in numbers. Nearly one-third of multirace respondents were of Hispanic origin, as were 97 percent of people checking only “some other race.” Together, multirace and some other race Hispanic respondents accounted for about one-half of all Hispanics, indicating the ambiguities confronting measurement of race for the Hispanic group. Consistency of reporting of Hispanic origin (as measured by responses of E-sample households compared with matching P-sample households) was very high (98 percent); consistency of race reporting was also high for non-Hispanic whites, blacks, and Asians, but quite low for multirace respondents, and only moderately high for other groups. Both missing data rates and distributions for ethnicity and race are sensitive to differences in question format, order, and wording.
Recommendation 8.1: The Census Bureau should support—both internally and externally, in cooperation with other statistical agencies—ongoing, intensive, and innovative research and testing on race and ethnicity reporting. Particular attention should be given to testing formats that increase consistency of reporting and to methods for establishing comparability between old and new definitions and measures.
MANAGEMENT AND RESEARCH
Finding 9.1: From the panel’s observations and discussion with key Census Bureau staff, it appears that the decentralized and diffuse organization structure for the 2000 census impeded some aspects of census planning, execution, and evaluation. There was no single operational officer (below the level of director or deputy director of the Bureau) clearly in charge of all aspects of the census; the structure for decision-making and coordination across units was largely hierarchical; and important perspectives inside the Bureau and from regional offices, local partners, and contractors were not always taken into account. These aspects of the 2000 management structure affected two areas in particular: (1) development of the Master Address File (MAF), which experienced numerous problems, and (2) the program to evaluate census processes and data quality, from which results were slow to appear and are often of limited use for understanding the quality of the 2000 census or for planning the 2010 census.
Finding 9.2: The quality of documentation and usability varies among internal 2000 census data files and specifications that are important for evaluation. Generally, the A.C.E. Program followed good practices for documentation, and the A.C.E. files are easy to use for many applications. However, the lack of well-documented and usable data files and specifications hampered timely evaluation of other important aspects of the census, such as the sources
contributing to the Master Address File and the implementation of imputation routines.
Recommendation 9.1: The Census Bureau should mine data sources created during the 2000 census process, such as the A.C.E. data, Person Duplication Studies, extracts from the Master Address File, a match of census records and the March 2000 Current Population Survey, and the Master Trace Sample. Such data can illuminate important outstanding questions about patterns of data quality and factors that may explain them in the 2000 census and suggest areas for research and testing to improve data quality in the 2010 census and the American Community Survey.
Recommendation 9.2: In addition to pursuing improvements for coverage evaluation in 2010 (see recommendation 6.1), the Census Bureau must materially strengthen the evaluation component for census operations and data quality in 2010 (and in the current testing program) in the following ways:
Identify important areas for evaluations to meet the needs of users and census planners and set evaluation priorities accordingly;
Design and document data collection and processing systems so that information can be readily extracted to support timely, useful evaluation studies;
Use graphical and other exploratory data analysis tools to identify patterns (e.g., mail return rates, imputation rates) for geographic areas and population groups that may suggest reasons for variations in data quality and ways to improve quality (such tools could also be useful in managing census operations);
Explore ways to incorporate real-time evaluation during the course of the census;
Give priority to development of technical staff resources for research, testing, and evaluation; and
Share preliminary analyses with outside researchers for critical assessment and feedback.
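Exploratory screening of the kind Recommendation 9.2 describes can start very simply: flag geographic areas whose mail return or imputation rates deviate sharply from the overall distribution, then investigate the flagged areas. The sketch below is an illustrative screening rule, not the Bureau's evaluation methodology, and the tract rates are invented.

```python
from statistics import mean, stdev

def flag_outlier_areas(rates, z_threshold=1.5):
    """Flag areas whose rate is far from the mean of all areas.

    rates: dict mapping area id -> rate (e.g., tract -> mail return rate).
    The 1.5-standard-deviation default is an arbitrary, small-sample-
    friendly choice for illustration.
    """
    values = list(rates.values())
    mu, sigma = mean(values), stdev(values)
    return [area for area, r in rates.items()
            if abs(r - mu) > z_threshold * sigma]

# Hypothetical mail return rates for five tracts; tract E stands out.
tracts = {"A": 0.74, "B": 0.71, "C": 0.73, "D": 0.72, "E": 0.45}
flagged = flag_outlier_areas(tracts)
```

In practice such screening would feed the graphical tools the recommendation mentions, with flagged areas examined for causes such as address-list problems or enumeration difficulties, and the same logic could run in near real time during census operations.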
Recommendation 9.3: The Census Bureau should seek ways to expand researcher access to microdata from and about the 2000 census in order to further understanding of census data quality and social science knowledge. Such data files as the 2000 A.C.E. E-sample and P-sample output files, for example, should be deposited with the Bureau’s Research Data Centers. To help the Bureau evaluate population coverage and data quality in the 2010 census, the Bureau should seek ways—using the experience with the Panel to Review the 2000 Census as a model—to furnish preliminary data, including microdata, to qualified researchers under arrangements that protect confidentiality.