1
Introduction

The goal of the decennial census is to count everyone in the country, once and in the right place, for the purpose of allocating representation in Congress. The census meets this goal only incompletely: some people who should be included are omitted, and some enumerations are duplicates, are in the wrong location, or are of persons who are not residents of the United States or of entities that are not people. These four components of coverage error (omissions, duplicates, enumerations in the wrong location, and other erroneous enumerations) have an important impact on the representation of demographic groups and geographic jurisdictions in Congress.

PROGRAM OBJECTIVES

Since the 1950 census there has been an effort by the Census Bureau to estimate the size of error in census counts for areas and demographic groups and to use the information to improve census processes. The programs to measure census coverage error are referred to as coverage measurement programs. In recent years, coverage measurement programs included a third objective—correcting the census for enumeration error, referred to as census adjustment. The techniques used in coverage measurement programs to understand the extent of enumeration errors are sample surveys, dual-systems estimation (DSE), and demographic analysis.
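
As a rough illustration of how dual-systems estimation works, the sketch below uses the textbook capture-recapture form: the population estimate is the number of correct census enumerations times the number of people found by an independent survey, divided by the number of people found by both. It is a simplified illustration that assumes the census and the survey capture people independently; it is not the Census Bureau's production estimator, and the function names and counts are hypothetical.

```python
def dual_system_estimate(census_correct: int, survey_total: int, matched: int) -> float:
    """Textbook dual-systems (capture-recapture) population estimate.

    census_correct: correct census enumerations in the area (hypothetical input)
    survey_total:   people found by the independent coverage survey
    matched:        survey people who were also matched to a census enumeration
    """
    if matched <= 0:
        raise ValueError("need at least one match to form the estimate")
    return census_correct * survey_total / matched


def net_undercount_rate(census_count: int, estimate: float) -> float:
    """Net undercount as a share of the estimated population."""
    return (estimate - census_count) / estimate


# Made-up counts for one area, for illustration only.
estimate = dual_system_estimate(census_correct=900, survey_total=500, matched=450)
rate = net_undercount_rate(census_count=950, estimate=estimate)
print(round(estimate), round(rate, 3))
```

In this made-up example the census count of 950 includes 50 erroneous enumerations, so the net figure alone does not show how many people were omitted and how many were counted in error.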

In contrast to the previous two censuses, the Census Bureau has decided that the 2010 census coverage measurement (CCM) program will have a new principal objective: to emphasize census improvement



rather than census correction.¹ As a result of this change, rather than focus on the measurement of net census coverage error for demographic and geographic subsets of the U.S. population, the coverage measurement program in 2010 will focus on measurement of the rates of the components of census coverage error for subsets of the population defined not only geographically and demographically, but also by the census processes used. The hope is to use this information to help identify census processes that are associated with a high rate of coverage error and then identify alternative processes to reduce the rates. This feedback loop will then help to facilitate census process improvement for subsequent censuses. Of course, an important secondary goal of the coverage measurement program still remains, which is to inform census data users about the net coverage error for large geographic areas and demographic groups.

The shift in the principal objective of the coverage measurement program from that adopted in 2000 (and in 1990) stems from both the specific circumstances surrounding the 2000 census and broader dynamics. With respect to the 2000 census, a decision by the Supreme Court in 1999 precluded the use of census adjustment for purposes of apportionment of the U.S. House of Representatives if it is based on data from a sample survey (as it would almost certainly be). Furthermore, the time needed to carry out and review coverage measurement also very likely precludes the use of adjusted counts as input into redrawing the boundaries of the districts for the U.S. House of Representatives (unless the dates for the census, April 1, or for redistricting, April 1 of the following year, are changed).
Also, the coverage problem is no longer one of census omissions alone but one of both erroneous enumerations (overcounts) and census omissions: Prior to 1990 the main coverage problem was census omissions, but at the national level in 2000 the number of erroneous enumerations was roughly the same as the number of census omissions.²

The new problem of both census omissions and erroneous enumerations has arisen partly because of the effort to reduce the main problem of census omissions that dominated prior to 2000. In response to the challenge of reducing census omissions, between 1960 and 2000 the Census Bureau added a number of alternative ways in which households could be included on the Master Address File (including the Local Update of Census Addresses Program), and in which individuals could be enumerated in the census (including the Be Counted Program). These additional ways for households and individuals to be included in the census certainly increased the number of duplicate enumerations, which contributed to the “balancing” of the undercount and the overcount in the 2000 census.

¹ Actually, this is in a sense a return to the pre-1990 goal of coverage measurement.

² While this balancing did not obtain for every demographic or geographic subgroup, it is also true that the differential nature of net coverage error was reduced from that of previous censuses. (For information on adjusted counts in 2000, see Schindler, 2006.)

In addition to the duplication resulting from new avenues for enumeration, there is evidence that a number of social dynamics are also increasing the potential for census overcounts (see National Research Council, 2006). First, the structure of households is becoming more complicated, with more people having attachments to multiple households, including children in shared custody. Second, the number of people with multiple residences is increasing: This group includes people with vacation homes and “commuter marriages.” It has also been hypothesized that the quality of the enumerator workforce has decreased over time.

The shift in the principal objectives of coverage measurement raises many interesting and important technical issues. For example: What sample design for the coverage measurement survey should be used? What estimation approaches should be used in support of the attempt to link error status to relevant census processes? What data products would best communicate the linkages between census component coverage error and census processes in need of improvement?

PANEL CHARGE AND WORK PLAN

At the Census Bureau’s request, the National Academies established the Panel on Correlation Bias and Coverage Measurement in the 2010 Decennial Census to examine the Census Bureau’s coverage measurement plans for 2010 with the following charge:

This project involves a study of four issues concerning census coverage estimation with the goal of developing improved methods for use in evaluating coverage of the 2010 census. A panel of experts will conduct the study under the auspices of the Committee on National Statistics of the Division of Behavioral and Social Sciences and Education. The panel is charged to review Census Bureau work on these topics and recommend directions for research.
The panel’s work may require development of statistical models to extend the dual-systems estimation (DSE) approach, and may also include suggestions for the use of auxiliary data sources such as administrative records.

DSE, as applied to the 1990 and 2000 censuses, had several benefits as well as limitations as a means for estimating net census coverage. Some of the limitations were:

1. The approach was designed for estimating net census coverage errors and did not provide accurate estimates of gross coverage errors, i.e., of gross census omissions separate from gross census erroneous enumerations. In the DSE approach applied in the 1990 and 2000 censuses, certain census enumerations classified as erroneous were balanced against certain coverage survey cases classified as nonmatches (census omissions) for the purpose of estimating net census coverage. Some of these paired census enumerations and coverage survey cases did not necessarily reflect gross errors.

2. The application of DSE in Accuracy and Coverage Evaluation (A.C.E.) Revision II during the 2000 census accounted for duplicates found in the census in a simplistic way due to lack of information as to which member of a duplicate pair was a correct enumeration and which was an erroneous enumeration. This led to estimation error, as did the simplistic treatment of A.C.E. cases (P-sample) that matched to census enumerations outside the search area.

3. The poststratification approach used to apply the DSE had certain limitations. First, the number of factors that could be included in the poststratification was limited because the approach cross-classified the factors, so that each factor added to the poststratification greatly split the sample. (Collapsing of poststrata was needed because many of the cross-classified cells had small sample sizes.) Second, the synthetic error that arose from the synthetic application of the poststratum coverage correction factors to produce estimates for subnational areas and population subgroups was not reflected in their corresponding variance estimates.

4. Comparisons of aggregate tabulations of DSEs with estimates from demographic analysis (DA), in both 1990 and 2000, suggested underestimation by DSE of persons missed by both the census and the coverage survey (correlation bias). In the 2000 A.C.E. Revision II, sex ratios from DA were used to determine factors to correct adult male estimates for correlation bias, assuming no correlation bias for children and adult females. This approach appeared effective for adult blacks, but there were concerns about the appropriateness of its assumptions for other race/origin groups (particularly Hispanics).
Also, DA totals for young children (0–9) exceeded the corresponding aggregated DSEs from A.C.E. Revision II by a sufficient amount to suggest possible correlation bias in estimates for young children.

The Census Bureau is interested in improving the DSE methodology to address the above issues to the extent possible, to develop improved methods for estimating coverage of the 2010 census, both in regard to net errors and gross errors.

This original charge to the panel had four areas of focus: (1) to effectively measure the components of coverage error rather than net coverage error; (2) to improve the determination of duplicate status and the measurement of the rate of census duplication; (3) to assess the use of model-based alternatives to poststratification, including their impact on the ability to model local heterogeneous effects; and (4) to examine the use of demographic analysis to correct for correlation bias. It was also understood that the panel’s work might involve the review of other statistical models proposed for estimation of net coverage error and the use of auxiliary data sources, such as administrative records, in DSE. Consistent with this, it was recognized that all the data retained from the 2010 census (not only the census enumerations themselves and the postenumeration survey and matching results, but also data collected by the various management information systems that monitor census processes) could prove useful in modeling census error rates and providing information on the sources of census error. Therefore, the panel was also asked to provide advice on what data should be retained from the 2010 census.

During the course of the study, several other issues in connection with the panel’s overall task arose: a review of the Census Bureau’s draft document providing a framework for defining and estimating components of census coverage error; examination of the possibility of estimating the match status of cases previously categorized as having insufficient information for matching, in order to reduce the number of cases classified as erroneous enumerations due to item nonresponse; assessment of the various alternatives that could be used to reduce or address contamination due to the similarity and simultaneity of the census coverage follow-up interviews and the initial CCM interviews in 2010; and the design of the CCM postenumeration survey. More generally, the panel considered any other limitations that the 2000 A.C.E. Program had in addressing the objective in 2010 of measuring the rate of census component coverage error.

The panel took as given the basic design of data collection and matching operations planned for census coverage measurement in 2010. The plans include a sizable postenumeration survey that will be matched to the census to assess match status for the housing units (and individual residents in those housing units) found in a sample of census block clusters.
The panel examined modifiable aspects of the data collection for the 2010 coverage measurement program, including the sample design, seeking possible improvements. The panel did not address the broader range of possible coverage measurement programs that might best support census improvement over time.

A postenumeration survey that is matched to the census, along with a sample of census records that are matched to the census enumerations, can be used to directly support the new objective of census improvement because one can identify individual census enumerations that are duplicates, erroneous enumerations, and enumerations in the wrong location. Furthermore, one can identify a sample of individuals that were omitted in the census enumerations. In addition, and crucially, one can identify the census processes that were used to enumerate these individuals, along with characteristics of the individuals, their households and housing units, and contextual variables. This information can then be analyzed using statistical models to link higher rates of each of the four types of census error to the associated census processes. Thus, a data collection and estimation program that was originally proposed to be used in an aggregate way for estimating net coverage error for large demographic and geographic groups is also very useful for identifying individuals of interest to populate a database to support statistical models predicting census coverage error.

The change in objectives also suggests that rather than try to “fix” the census for net undercoverage using sampling-based statistical procedures, it may be preferable to use information on census coverage error to identify deficiencies in the decennial census processes.

Finally, in the course of its work, the panel also considered the possible benefits of a broader program of research on census coverage measurement. The panel explored other activities that might support measurement of components of census coverage error. This work was undertaken while recognizing that plans are close to final as the 2010 census nears, with a view to plans for coverage measurement for 2020.

In sum, the panel undertook to evaluate the Census Bureau’s plans for coverage measurement in the 2010 census and to provide suggestions and recommendations for changes and additions to those plans, given the new objective of measuring the rates of components of census coverage error, with the ultimate goal of assessing the contribution of various census component processes to census coverage error.

PLAN OF THE REPORT

Substantial portions of this report are taken from the material in the panel’s interim report (National Research Council, 2007).
This report expands the panel’s work in five areas: assessment of duplicate status, missing data methods, the census coverage measurement sample design, improvements to demographic analysis, and treatment of the potential contamination of the census coverage measurement sample interview by the overlap in the field with the census coverage follow-up interview.

To collect the necessary information for this study, the panel held six plenary meetings between August 2004 and July 2007. During the course of these meetings, Census Bureau staff described their current coverage measurement research activities and intended directions for further work, their test and dress rehearsal plans, and their plans for the 2010 CCM program. Some of the Census Bureau’s research on net coverage error has been facilitated by the development of an A.C.E. research database. This database contains the data collected by A.C.E. to support estimation of net coverage error in 2000, and it is weighted to represent the additional information collected from the national duplicates search and the evaluation follow-up survey so that the net coverage error estimates produced are nearly identical to those from A.C.E. Revision II.

This introductory chapter is followed by four chapters and three appendices. Chapter 2 first discusses types of census coverage error and the coverage error metrics for domains of interest. It then describes the three primary purposes for coverage measurement, as well as DSE and demographic analysis, the two primary methods used to measure net coverage error. Chapter 2 also presents short histories of the U.S. census coverage measurement programs from 1950 to 1990, along with a description of A.C.E., the coverage measurement program for the 2000 census.

Chapter 3 examines how the 2010 census differs from the 2000 census with respect to the impact on the coverage measurement program for 2010. It looks in some depth at the treatment of duplicates in the 2010 census and the 2010 coverage measurement program, including the possibility of contamination of the 2010 coverage measurement data collection through the application of the coverage follow-up interview. The chapter also discusses how the use of administrative records could potentially assist in both coverage improvement and coverage measurement for the 2010 census.

Chapter 4 discusses a number of technical topics introduced by the various changes made in coverage measurement for 2010, including: the sample design for the census coverage measurement postenumeration survey in 2010; the use of logistic regression modeling as a substitute for poststratification in modeling net coverage error; how one compares competing models in this situation; and the treatment of missing data in net coverage error modeling, including the Census Bureau’s current plans for addressing missing data prior to fitting the logistic regression models in 2010.
In relation to the issue of missing data, the chapter includes a description of an attempt by the Census Bureau to greatly reduce the number of cases that are considered to have insufficient information to support matching. The chapter concludes with a discussion of how to improve demographic analysis for use in census coverage measurement in 2010.

Chapter 5 first briefly outlines the Census Bureau’s framework for defining and estimating components of census coverage error. It then considers potential variables for use in statistical models to assess correlates of components of census coverage error. The chapter ends with a consideration of the purpose of the key output from the census coverage measurement program in 2010: the analytic capability to develop statistical models linking census coverage errors of various types to individual and household characteristics and census process variables.

There are three appendixes. Appendix A provides additional details from the paper by Mulry and Kostanich (2006). Appendix B provides additional details on the use of logistic regression models as a substitute for poststratification. Appendix C provides biographical sketches of panel members and staff.