For the past three decades, coverage measurement of the decennial census has employed a postenumeration survey to provide estimates of net coverage error for subnational and demographic domains based on dual-systems estimation. Coverage measurement has three potential purposes: (1) to inform census data users about the quality of the counts for various uses, (2) to inform how census processes might be modified to improve the quality of the next census, and (3) to modify or adjust the census counts for official purposes.
In the 2000 census, the coverage measurement program was referred to as Accuracy and Coverage Evaluation (A.C.E.). A.C.E. was designed primarily with the third purpose, adjustment of the census counts, in mind. However, a 1999 Supreme Court decision forbade the use of sampling, and therefore of A.C.E., for generating the census counts used for apportionment of the U.S. House of Representatives. Ultimately, inconsistencies in the A.C.E. results led the Census Bureau to decide not to use these data to adjust the 2000 census counts for redistricting or other official purposes.
For the 2010 census, the use of a coverage measurement program to adjust apportionment counts is still precluded by the 1999 Supreme Court decision. The use of a coverage measurement program as a basis for adjusting the census counts for legislative redistricting is seen by the Census Bureau as problematic, given the lack of time for a comprehensive evaluation of the quality of the adjusted counts. Given these limitations, as well as other considerations, the Census Bureau has decided to change the primary objective of coverage measurement in the 2010 census to that of providing information to improve the next census. This is consistent with Recommendation 6.1 of the report The 2000 Census: Counting Under Adversity (National Research Council, 2004b) and is supported here.
Although these three basic purposes of coverage measurement are related, they place different demands on a coverage measurement program. The focus of coverage measurement for adjustment is to estimate net coverage error; for census process improvement, estimates of net coverage error are insufficient, since they may hide offsetting errors arising from problems with census processes. For example, an erroneous enumeration may or may not be a duplicate of another enumeration; for net error measurement it is not crucial to know if it is a duplicate, but this question is important for improving census processes.
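The way net error can mask component errors is easy to see with a small arithmetic sketch. The counts below are hypothetical and do not come from any census or from this report; the function name and its arguments are likewise illustrative. Enumerations in the wrong place are left out for simplicity, since they cancel at the national level and matter only for subnational domains.

```python
def coverage_summary(true_population, omissions, erroneous, duplicates):
    """Return the census count plus net and gross coverage error for one domain.

    All inputs are hypothetical person counts:
      omissions  - people who should have been counted but were not
      erroneous  - non-duplicate enumerations that should not be in the count
      duplicates - extra enumerations of people already counted once
    """
    census_count = true_population - omissions + erroneous + duplicates
    return {
        "census_count": census_count,
        # Net error: overcounts and undercounts offset one another.
        "net_error": census_count - true_population,
        # Gross error: every component error is counted, with no offsetting.
        "gross_error": omissions + erroneous + duplicates,
    }


summary = coverage_summary(true_population=1000, omissions=50,
                           erroneous=30, duplicates=18)
print(summary)
```

With these made-up inputs the net error is only -2 persons, while the gross error is 98, which is the sense in which a small net error can conceal substantial problems with census processes.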
Therefore, the focus of coverage measurement in the 2010 census will be on exploring the four basic coverage errors: omissions, erroneous enumerations, duplications, and enumerations in the wrong place. In addition, the overall census design for 2010 is considerably different from that of 2000, the primary differences being that (1) the census long form will be eliminated, (2) the field enumerators will use hand-held computing devices for nonresponse follow-up, (3) the Master Address File/TIGER system will be updated and improved, (4) there will be a major effort to identify duplicate counts in the census and remove them from the final tabulations (this effort includes the collection of data on alternate residences and a national data search for duplicates), and (5) the coverage follow-up interview will be expanded to try to identify and rectify possible omissions from the census and enumerations in the wrong place. Despite these changes to the coverage measurement goals and the census itself, the Census Bureau plans to rely again on a postenumeration survey to collect data for coverage measurement and on dual-systems estimation to estimate net coverage error. Simultaneously adjusting to the new goals for coverage measurement and to a new census design raises a number of complex problems. The Census Bureau has asked the National Academies to review and critique its test and research efforts in planning the 2010 coverage measurement program.
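For readers unfamiliar with dual-systems estimation, the following is a minimal sketch of the textbook capture-recapture form of the estimator, assuming the census and the postenumeration survey are independent and that matching is error free. It is not the Census Bureau's production estimator, which involves additional terms and adjustments; the function names and all input values are hypothetical.

```python
def dual_system_estimate(census_correct, pes_total, matched):
    """Textbook dual-systems (capture-recapture) estimate of the true population.

    census_correct - weighted count of correct census enumerations
    pes_total      - weighted postenumeration survey population estimate
    matched        - weighted count of survey persons matched to the census
    """
    if matched <= 0:
        raise ValueError("matched must be positive")
    # The match rate (matched / pes_total) estimates the share of the true
    # population captured by the census, so dividing by it scales up the count.
    return census_correct * pes_total / matched


def net_undercount_rate(census_count, dse):
    """Net undercount as a share of the dual-systems estimate (negative = overcount)."""
    return (dse - census_count) / dse


# Hypothetical inputs for a single estimation domain.
dse = dual_system_estimate(census_correct=900, pes_total=800, matched=750)
rate = net_undercount_rate(census_count=930, dse=dse)
print(dse, rate)
```

In this made-up domain the estimate is 960 persons against a census count of 930, a net undercount of about 3.1 percent. The single rate says nothing about how many omissions, duplicates, or other erroneous enumerations lie behind it.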
This interim report of the Panel on Coverage Evaluation and Correlation Bias in the 2010 Census describes and reviews the research activities carried out to date by the Census Bureau in developing the coverage measurement program for 2010. The panel will provide more direction in its final report on several of the technical challenges facing the Census Bureau raised by these research activities in working toward 2010. Chapter 4 provides a list of the topics the panel hopes to address. Those of particular importance are (a) the data to save in 2010 to support the various coverage measurement models, (b) random effects modeling for small area estimation, (c) treatment of nondata-defined cases in logistic regression, (d) allowable covariates in the logistic regression models for correct enumeration status and for match status, (e) sample design for the postenumeration survey in 2010, (f) improvements in demographic analysis in 2010, (g) the products to use to inform about census component coverage error, and (h) very generally, how to best operate a feedback loop for census improvement. In addition, the panel proposes an analytic framework that may suggest additional research activities, which may also be expanded in the final report.
THE 2010 COVERAGE MEASUREMENT RESEARCH PROGRAM
The Census Bureau has initiated a number of important projects in response to the need to redesign coverage measurement and related activities. These research activities include:
Design of the coverage measurement program for the 2006 census test to collect information on various operational parameters to accommodate the changing goals of coverage measurement and to assess the potential for contamination if the coverage follow-up interview overlaps in time with the postenumeration survey interviews in 2010.
Development of a framework for defining types of census coverage error and the assumptions needed for their estimation.
Matching of postenumeration survey cases with census enumerations that carry minimal information and were previously judged to have insufficient information for matching.
Refinement of methods for identifying and removing duplicate enumerations in real time.
Use of logistic regression for net error modeling, replacing the use of poststratification and synthetic estimation.
Modification of the A.C.E. sample design for the postenumeration survey to be used in 2010.
In addition, the Census Bureau is making impressive progress in the creation of merged, unduplicated lists (referred to collectively as E-StARS) from various administrative records of both residential addresses and persons, which could have important implications for both the census and coverage measurement in 2010. The panel is impressed with the various research programs, which provide important information for use in coverage measurement in 2010. In this report, the panel offers advice on this research program in the following areas:
Use of cross-validation for assessing alternative logistic regression models for estimating match probability and correct enumeration probability.
Use of survey weights in the development and analysis of logistic regression models.
Appropriate selection of covariates in the logistic regression models for match and correct enumeration probability, and the use of random effects to incorporate small-area variation in these models.
Sample design for the postenumeration survey to be used in coverage measurement in 2010.
Use of administrative records for assisting with coverage measurement in 2010.
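The first two areas of advice listed above, cross-validation and survey weights, can be combined in one procedure. The sketch below is an illustration only: it fits weighted logistic regression models by plain gradient descent and compares two candidate covariate sets by their weighted held-out log loss. The data are simulated, the weights are arbitrary stand-ins for survey weights, all function names are hypothetical, and the simple random folds ignore the clustered design of an actual postenumeration survey.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(beta, x):
    """Predicted probability for covariates x; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_logistic(X, y, w, steps=200, lr=0.5):
    """Weighted logistic regression fitted by gradient descent."""
    beta = [0.0] * (len(X[0]) + 1)
    total_w = sum(w)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, w):
            err = wi * (predict(beta, xi) - yi)
            grad[0] += err
            for j, v in enumerate(xi, start=1):
                grad[j] += err * v
        beta = [b - lr * g / total_w for b, g in zip(beta, grad)]
    return beta


def weighted_log_loss(beta, X, y, w):
    """Weighted mean negative log-likelihood (lower is better)."""
    loss = 0.0
    for xi, yi, wi in zip(X, y, w):
        p = min(max(predict(beta, xi), 1e-12), 1.0 - 1e-12)
        loss -= wi * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return loss / sum(w)


def cross_validate(X, y, w, columns, k=5, seed=0):
    """Mean weighted held-out log loss over k random folds for a covariate set."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]

    def pick(rows):
        return [[X[r][c] for c in columns] for r in rows]

    losses = []
    for fold in folds:
        held = set(fold)
        train = [r for r in idx if r not in held]
        beta = fit_logistic(pick(train), [y[r] for r in train],
                            [w[r] for r in train])
        losses.append(weighted_log_loss(beta, pick(fold), [y[r] for r in fold],
                                        [w[r] for r in fold]))
    return sum(losses) / k


# Simulated data: match status depends on covariate 0 only; covariate 1 is noise.
rng = random.Random(1)
X, y, w = [], [], []
for _ in range(300):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([x0, x1])
    y.append(1 if rng.random() < sigmoid(1.0 + 1.5 * x0) else 0)
    w.append(rng.uniform(0.5, 2.0))  # arbitrary stand-in for survey weights

loss_noise_only = cross_validate(X, y, w, columns=[1])
loss_both = cross_validate(X, y, w, columns=[0, 1])
print(loss_noise_only, loss_both)
```

The covariate set with the lower held-out loss predicts match status better on data not used in fitting. In practice the folds would need to respect the sample design, for example by holding out whole block clusters rather than individual persons.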
There is a substantial research literature on why people are missed in the census, as well as a more limited literature on why people are duplicated and erroneously enumerated. Furthermore, there remains substantial information from A.C.E. in 2000 on why census coverage errors were made on various households and people. Building on this base, the goal should be to develop statistical models that incorporate what is currently known about the sources of census coverage error and that help create a feedback loop from census coverage errors to deficient census processes. Further development of such statistical models after the 2010 census will benefit from the availability of linked data on (a) person, household, and area characteristics; (b) the specific census processes used to enumerate a person; and (c) whether the person was missed, erroneously enumerated, enumerated in the wrong place, duplicated, or correctly enumerated.
The panel offers four recommendations concerning coverage measurement plans for 2010. They are as follows:
Recommendation 1: The Census Bureau should evaluate, for use in the 2010 Census Coverage Measurement Program, a broader range of models, most importantly logistic regression models, for net coverage error that include variables in addition to those used to define the A.C.E. poststratification. These should include a wider range of predictors (e.g., geographic, contextual, family and housing variables and census operational variables), alternative model forms (e.g., classification trees), and the use of random effects to model small-area variation.
Recommendation 2: The Census Bureau should choose one or more of the proposed uses of administrative records (e.g., tax record data or state unemployment compensation data) for coverage improvement, nonresponse follow-up, or coverage measurement and comprehensively test those applications during the 2008 census dress rehearsal. If a process using administrative records improves on processes used in 2000, that process should be implemented in the 2010 census.
Recommendation 3: The Census Bureau should collect data in the 2010 census to support development of a database that links person, household, and housing unit characteristics, census processes, and the presence or absence of census component coverage error. This database should also represent coverage errors, including erroneous enumerations, enumerations in the wrong place, duplications, and omissions. The use of this database would better identify the sources of high rates of census component coverage error.
Recommendation 4: Given the number of important research activities currently under way, the needed design of the coverage measurement programs in the dress rehearsal and in the 2010 census, and the additional research suggested by the panel, the Census Bureau should provide the coverage measurement group with sufficient resources to carry out its current research program, its planning activities regarding the dress rehearsal and the 2010 census, and the activities listed in this report.