Suggested Citation:"3 Assessment of the Census Bureau’s Current Research Program for Coverage Evaluation in 2010." National Research Council. 2007. Research and Plans for Coverage Measurement in the 2010 Census: Interim Assessment. Washington, DC: The National Academies Press. doi: 10.17226/11941.

3
ASSESSMENT OF THE CENSUS BUREAU’S CURRENT RESEARCH PROGRAM FOR COVERAGE EVALUATION IN 2010

The Census Bureau is currently engaged in a number of important research initiatives that it expects will improve its coverage evaluation program in the 2010 census. Part of this research effort has focused on the design of the coverage evaluation programs for the 2006 census test and for the 2008 dress rehearsal; the dress rehearsal program is the last major opportunity to test plans for coverage evaluation prior to the 2010 census. In particular, the Census Bureau has devoted considerable effort to researching new methods that would be effective in measuring the components of census coverage error in the 2010 census.

In this chapter, we describe and assess both the census test design in 2006 and the other major activities of the coverage evaluation research program. We introduce this by comparing the plans for the 2010 and 2000 censuses and then describing the limitations of Accuracy and Coverage Evaluation (A.C.E.) in measuring census component coverage error. Following the Census Bureau’s terminology, we refer to the 2010 coverage evaluation program as census coverage measurement, or CCM.

HOW THE 2010 CENSUS DIFFERS FROM THE 2000 CENSUS

The 2010 census has an innovative design, resulting in a census that differs from its predecessor as much as any since the incorporation of mailout-mailback data collection in 1970. Furthermore, the design for the 2010 census is dramatically different from the 2000 census in ways that will appreciably affect the 2010 coverage evaluation program. In this section we outline how the 2010 census will differ from the 2000 census and how those changes are likely to affect CCM.

The primary differences between the 2000 and 2010 census designs, as currently planned, are as follows:

A short-form only census. The Census Bureau has now fielded the American Community Survey (ACS), which is a continuous version of the decennial census long form. Therefore, under current plans there will be no long form in the 2010 census. This reduces respondent burden and will facilitate several aspects of data collection in the census, including data capture, data editing and imputation for nonresponse, the work of follow-up enumerators, and the management of foreign language forms and foreign language assistance. As a result, this change is likely to improve data quality.
Use of handheld computing devices for nonresponse follow-up. The enumerators who follow up nonrespondent households will now use a handheld computing device to (1) administer the census questionnaire (computer-assisted personal interviewing), (2) edit the responses in real time, (3) collect, save, and transmit the data to census processing centers, (4) help locate residences through the use of computer-generated maps (and possibly geographic coordinates), and (5) possibly help organize enumerator routes.
Improved MAF/TIGER system. The Master Address File (MAF) has been identified as being deficient. For example, see National Research Council (2004b: Finding 4.4). There are currently efforts to improve, for 2010, both the Census Bureau’s MAF and its geographic database, the TIGER (Topologically Integrated Geographic Encoding and Referencing) system. The MAF provides a list of household addresses, and TIGER is used to associate each address on the MAF with a physical location. The MAF/TIGER Enhancement Program includes (1) the realignment of every street and boundary in the TIGER database; (2) development of a new MAF/TIGER processing environment and the integration of the two previously separate resources into a common technical platform; (3) expansion of geographic partnership programs with state, local and tribal governments, other federal agencies, the U.S. Postal Service, and the private sector; (4) implementation of a program to use ACS enumerators to generate address updates, primarily in rural areas; and (5) use of periodic evaluation activities to provide quality metrics to guide corrective actions (Hawley, 2004). One motivation for this initiative was the recognition by the Census Bureau that many census errors and inefficiencies in 2000 resulted from errors in the Master Address File and in the information on the physical location of addresses.
Coverage follow-up interview. The Census Bureau is greatly expanding the percentage of housing units that will be administered a coverage follow-up (CFU) interview in 2010, in comparison to those in 2000 that were administered either the Coverage Edit Follow-up (CEFU) or the Coverage Improvement Follow-up (CIFU) interviews. CEFU was used to determine the correct count and characteristics for people in households with more than six residents (since the census form had space for information for only six persons), and the correct count for households with count discrepancies (e.g., differences between the number of separate people listed on the questionnaire and the indicated total number of residents). CIFU was used to determine whether addresses that were initially judged as being vacant were in fact vacant. The expansion of CFU over CEFU and CIFU was motivated by the recognition, partially provided by A.C.E., that confusion with residence rules made an important contribution to census coverage error. The CFU interview will be greatly expanded in 2010 to include not only those three situations, but also the following: (a) households with a possible duplicate enumeration identified by a computer match of the census returns against one another, (b) other addresses at which at least one resident sometimes lived (to avoid enumerations in the wrong location), and (c) other people who sometimes lived in the household (to avoid undercoverage). The latter two situations will be detected through the addition of two “coverage probe” questions to the census form. However, due to resource and time constraints, the Census Bureau may be able to administer the CFU to only a subset of the qualifying households in 2010. The Census Bureau thinks that it may be able to follow up only 5 to 10 percent of the nation’s addresses for this purpose, but some preliminary estimates suggest that a larger percentage may satisfy one or more of these contingencies.¹ In that case, the Census Bureau may have to prioritize by selecting a subset of the qualifying households that are more likely to provide information that would result in a less erroneous count. Implementation of this operation will depend on information collected in the 2006 test census and the 2008 dress rehearsal.

Removal of duplicate enumerations in real time. As implied in the description of the coverage follow-up interview above, the CFU interview will be used to follow up suspected duplicate enumerations that are identified through a national computer search for duplicates, with the objective of determining which address of a pair of duplicates is the correct residence and consequently removing the erroneous duplicate enumeration from the census.
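
To make the CFU qualifying conditions concrete, the following is a minimal sketch, not the Census Bureau's procedure. It flags returns on the situations described above that can be read off a return (large households, count discrepancies, possible duplicates, and the two coverage probes) and then keeps a capped subset. The vacancy check is left out for brevity. All field names, the example returns, and the rule of ranking by number of reasons are assumptions made for illustration.

```python
from dataclasses import dataclass

FORM_CAPACITY = 6  # per the text, the form had space for only six persons


@dataclass
class CensusReturn:
    """One census return. All field names are hypothetical."""
    address_id: str
    listed_persons: int        # separate people listed on the questionnaire
    reported_total: int        # indicated total number of residents
    possible_duplicate: bool   # flagged by the computer match of returns
    lived_elsewhere: bool      # coverage probe: a resident sometimes lived elsewhere
    others_stayed_here: bool   # coverage probe: other people sometimes lived here


def cfu_reasons(r):
    """Return the list of reasons a return would qualify for follow-up."""
    reasons = []
    if r.reported_total > FORM_CAPACITY:
        reasons.append("large household")
    if r.listed_persons != r.reported_total:
        reasons.append("count discrepancy")
    if r.possible_duplicate:
        reasons.append("possible duplicate")
    if r.lived_elsewhere:
        reasons.append("other address")
    if r.others_stayed_here:
        reasons.append("other people")
    return reasons


def select_for_followup(returns, max_cases):
    """Keep at most max_cases qualifying returns, most reasons first.

    Ranking by number of reasons is a placeholder assumption, not a
    Census Bureau rule.
    """
    qualifying = [(r, cfu_reasons(r)) for r in returns]
    qualifying = [(r, why) for r, why in qualifying if why]
    qualifying.sort(key=lambda pair: len(pair[1]), reverse=True)
    return qualifying[:max_cases]


# Made-up returns for illustration only.
returns = [
    CensusReturn("A", 2, 2, False, False, False),
    CensusReturn("B", 6, 8, False, False, True),
    CensusReturn("C", 3, 3, True, False, False),
    CensusReturn("D", 1, 1, False, True, False),
]
selected = select_for_followup(returns, max_cases=2)
print([(r.address_id, why) for r, why in selected])
```

Here three of the four returns qualify, but the cap admits only two, which mirrors the situation in which more addresses satisfy the contingencies than can be followed up.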

This new census design has some benefits for the coverage measurement program in 2010. Focusing on the collection of short-form data and the use of handheld computing devices might improve the quality of the information collected, thereby improving the quality of the matching of the postenumeration survey (PES) to the census. Having an improved and more complete MAF should reduce the extent of whole-household undercoverage. Finally, the national search for and field verification of duplicate enumerations should reduce the number of duplicates in the census, which may facilitate the estimation of component errors in the census and may also simplify the application of the net coverage error models used in dual-systems estimation (DSE). So the changes to the 2010 census design are also likely to improve the quality of the coverage measurement information provided in 2010.

It is important to emphasize that some of the changes to the 2010 census design were motivated by the results of the 2000 A.C.E. program. Specifically, the large number of erroneous enumerations, especially duplicates, motivated the expansion of the CFU interview, as well as the implementation of the national search for duplicates. Also, although not directly a finding from A.C.E., the recognition that the 2000 census Master Address File had a large number of duplicates and was otherwise of uncertain quality motivated some of the improvements of the MAF/TIGER system.

¹	We think that reasonable estimates may already be possible given data from 2000 and the later census tests. For example, the 2004 census test indicates that categories (b) and (c) may sum to 11 percent or so.

LIMITATIONS OF A.C.E. IN MEASURING COMPONENT COVERAGE ERROR

A.C.E. in the 2000 census was planned from the outset as a method for adjusting census counts for net coverage error. Hence, A.C.E. focused on estimating net census coverage error rather than summaries of census component errors. For example, the limited geographic search for matches used in A.C.E. relied on the balancing of some erroneous enumerations and omissions that were actually valid E-sample enumerations but in the wrong location. Such errors could result, for example, from a geocoding error (placing an address in the wrong census geography) or enumeration of someone at a second home. Because such erroneous enumerations and omissions were expected to balance each other, on average, they were expected to have little impact on the measurement of net coverage error. Therefore A.C.E. did not allocate the additional resources that would have been required to distinguish these situations from entirely erroneous enumerations or omissions. Similarly, A.C.E. did not always distinguish between an erroneous enumeration and counting a duplicate enumeration at the wrong location.

The following are limitations of A.C.E. in 2000 for measuring census component coverage error:

Inadequate information collected as part of the census and the PES allowed too many mistakes in the A.C.E. final determination of Census Day residence. In 2000, comprehensive information was not collected from a household either in the census or in the A.C.E. interview regarding other residences that residents of a household often used or on other individuals who occasionally stayed at the household in question. This limited the Census Bureau’s ability to correctly assess residency status for many individuals. The Census Bureau intends to include more probes to assess residence status in the 2010 census questionnaire, in the census follow-up interview, and on the 2010 CCM questionnaires. Also, in 2010, the duplicate search will be done nationwide, not only for the PES population. In addition, the Census Bureau plans on incorporating a real-time field verification of duplicate enumerations in 2010. (For details on issues in determining correct residence, see U.S. Census Bureau, 2003.)
Also, nonresponse in the E- and P-samples complicated matching of the P-sample to the E-sample (for coverage measurement) and of the E-sample to the census (to identify duplicates). It also complicated estimation because it interfered with assigning a person to the correct poststratum (under the 2000 design), and it would create missing values for predictor variables under the logistic regression proposed for 2010 (discussed below). (For details, see Mulry, 2002.)
Furthermore, the missing data treatments used for individuals with extensive nonresponse failed to fully utilize the available data. Procedures are now being examined that make greater use of the available data, especially on household composition, to determine the match status of these individuals in 2010.

Also, the methodology used for individuals who moved between Census Day and the day of the postenumeration interview (known as PES-C) resulted in a large percentage of proxy enumerations, which in turn resulted in matching error.² (PES-C was implemented in 2000 due to the early plans, later cancelled, to use sampling for nonresponse follow-up in the 2000 census.) The Census Bureau will probably return in 2010 to the use of PES-B (similar to the 1990 methodology), which relies completely on information from the inmover.
The A.C.E. Revision II estimates modified undercoverage estimates for adult black men using sex ratios from demographic analysis (ratios of the number of women to the number of men for a demographic group) to correct for correlation bias (for details, see Bell, 2001; Haines, 2002). This method assumes that the estimated adult sex ratios from demographic analysis are more accurate and precise than those from the A.C.E. For nonblack Hispanics, estimation of adult sex ratios requires a long historical series of Hispanic births and deaths and, more importantly, highly accurate data on the magnitude and sex composition of immigration (both legal and undocumented). The historical birth and death data for Hispanics are available only since the 1980s, and the available measures of immigration are too imprecise for this application. Consequently, this use of demographic analysis to modify A.C.E. estimates was not directly applicable to nonblack Hispanic males in 2000.³
The approach taken to estimate net census coverage error relied on balancing erroneous enumerations against omissions in cases in which there was insufficient information for matching E- and P-sample cases. Consequently, A.C.E. was not effective at estimating components of census error.
Poststratification is used to reduce correlation bias (see description in Chapter 2), since it partitions the U.S. population into relatively homogeneous groups. The number of factors that could be included in the poststratification used in A.C.E. was limited because the approach fully cross-classified many of the defining factors, with the result that each additional factor greatly reduced the sample size per poststratum. (For details of the 2000 poststrata, see U.S. Census Bureau, 2003.) The 2010 plan uses logistic regression modeling to reflect the influence of many factors on coverage rates without having to define a large number of poststrata.
Also, small-area variations in census coverage error that are not corrected by application of the poststratum adjustment factors to produce estimates for subnational domains (referred to as synthetic estimation) were not reflected in the variance estimates of adjusted census counts. The Census Bureau is examining the use of random effects in its adjustment models to account for the residual variation in small-area coverage rates beyond what is modeled through synthetic estimation.

²	PES-C collected information about whether a PES outmover household matched to the census through use of information about the outmover household (often using proxy information), but resulting matches were applied to the size of the inmover household rather than the size of the outmover household because the information on the number of inmovers was considered to be of greater reliability.
³	In support of this argument, it is useful to note that a majority of working-age (18-64) Hispanics, about 55 percent, are foreign-born, whereas fewer than 5 percent of whites and slightly more than 5 percent of blacks are.
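
The sample-size problem with fully cross-classified poststrata, noted in the list above, can be illustrated with simple arithmetic. The sketch below is an illustration under stated assumptions: the factor names, the number of levels per factor, and the sample size are all made up, and the parameter count is for a generic main-effects logistic regression rather than any model the Census Bureau has specified.

```python
from math import prod


def poststratum_count(levels_per_factor):
    """Number of cells when every factor is fully cross-classified."""
    return prod(levels_per_factor.values())


def mean_sample_per_cell(sample_size, levels_per_factor):
    """Average sample size per poststratum under full cross-classification."""
    return sample_size / poststratum_count(levels_per_factor)


def main_effect_parameters(levels_per_factor):
    """Intercept plus (levels - 1) coefficients per factor, no interactions."""
    return 1 + sum(k - 1 for k in levels_per_factor.values())


# Hypothetical factors and level counts, for illustration only.
factors = {"race_origin": 7, "age_sex": 8, "tenure": 2}
sample_size = 600_000  # made-up number of sampled persons

print(poststratum_count(factors))                 # cells with three factors
print(mean_sample_per_cell(sample_size, factors))
print(main_effect_parameters(factors))

# Adding one four-level factor multiplies the cells by four,
# but adds only three regression coefficients.
more_factors = dict(factors, region=4)
print(poststratum_count(more_factors))
print(mean_sample_per_cell(sample_size, more_factors))
print(main_effect_parameters(more_factors))
```

Each added factor multiplies the number of poststrata, and so divides the average sample per cell, whereas a main-effects regression grows only additively in the number of parameters.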

In addition to these issues, other features of A.C.E., including aspects of data collection and sample design, made the 2000 A.C.E. less informative than it might have been in measuring census component coverage errors. As stated above, this was only to be expected given the focus of A.C.E. on producing adjusted census counts, well justified by the desire to remedy long-standing patterns of differential undercoverage of minorities in the census. However, with the new priority of measuring census component coverage error, a number of design and data collection decisions, within the general framework of PES data collection, remain open to modification. Furthermore, as we argue below, estimation of net census error also remains important for assessment of census component coverage error, specifically census omissions.

PLANS FOR COVERAGE EVALUATION IN THE 2006 CENSUS TEST

The goals for the 2006 test census relevant to coverage evaluation were as follows (U.S. Census Bureau, 2004):

To examine how the Census Bureau can improve the determination of Census Day residence in the CCM process through modification of the census questionnaire, the initial PES questionnaire, and the PES follow-up interview. This may be the most important problem facing coverage evaluation and the greatest opportunity for improvement, because the A.C.E. underestimated erroneous enumerations by 4.7 million people in 2000, and overestimated the P-sample population by 2.2 million, much of which was probably due to errors in enumerating people in their proper census location (see National Research Council, 2004b: 218 and 253, for details). A request for information on alternative addresses or additional part-time residents was not included in the 2000 census, which limited attempts to ascertain correct Census Day residence.
To test procedures for determining more accurately the location of a person’s Census Day residence outside the P-sample blocks for P-sample inmovers and for people with multiple residences.
To determine how the more extensive matching for duplicates and people with multiple residences (following up on information collected in the CFU interview) can be implemented with the anticipated resource and time constraints in 2010.
To identify additional data to be collected on census processes in support of the measurement and analysis of census component coverage error.
To measure possible contamination of the CFU interview by the (possibly simultaneous) collection of coverage measurement information and to assess the implications for CCM data collection and estimation.

The coverage evaluation program of the 2006 test census began with a PES, after the census data collection was complete, in which computer-assisted personal interviews were administered to an independent sample of approximately 5,000 housing units (drawn from the same address list as the census) to determine Census Day residence. This was followed by an automated and then clerical match to the census enumerations, with field follow-up of those with unresolved match status. A person follow-up interview was conducted simultaneously with the PES to collect additional data to resolve residence status for various situations.

Once matching was completed, the CCM program used DSE based on the usual E- and P-samples, except that the addresses for the P-sample were identical to those of the test census. This exception prevented the measurement of whole-household omissions in the test census. People who moved between the time of the census and the collection of postenumeration survey (CCM) data were handled with PES-B methodology, which counts the number of people resident in the CCM blocks at the time of the postenumeration survey rather than the number of people resident on Census Day. Information on the other locations at which a person might also have been counted was collected in a follow-up interview for households that indicated other residences on the census questionnaire. This was to assist in the assessment of correct residence and to better define omissions, erroneous enumerations, and duplications.

The CCM person interviewing used a laptop for the initial interview, and unresolved matches were followed up with personal visits. For census returns that provided a phone number, the CCM interviews were carried out by telephone, as in 2000. CCM personal visits did not begin until nonresponse follow-up was concluded. However, CCM interviewing was simultaneous with the CFU follow-up interview. There was an automated and computer-assisted clerical search for P-sample matches and duplicate census enumerations at the Census Day residence location, as well as at other locations where the person may have been counted. There was also an automated search across all census enumerations in the test site both for P-sample matches and for duplicate census enumerations. To better estimate components of census coverage error, a match was also attempted for census enumerations that had a missing or deficient name or were otherwise difficult to match because of limited information (see discussion of KEs below). No weighting or imputation was carried out for missing data, and coverage estimates were not produced. Finally, the Census Bureau will explore various estimation methodologies to generate estimates of components of census coverage error and net coverage error, conditional on the limitations of the census test, to examine whether sufficient and consistent data are being collected.

Unlike the case for decennial coverage measurement programs, no attempt was made to collect data to assess whole-household undercoverage. Also, no attempt was made to assess the undercoverage of individuals living in group quarters. (Current plans for CCM in 2010 also exclude group quarters, about 2.7 percent of the U.S. population, from coverage measurement. This is unfortunate, given the difficulty in counting the institutional population.) Data needed to estimate coverage error (both net coverage error and components of coverage error) for persons living in housing units will be assembled by census operation to support the linkage of census component coverage error with specific census operations.

MAJOR ACTIVITIES OF THE CENSUS COVERAGE MEASUREMENT RESEARCH PROGRAM

The CCM research program involves several activities grouped into the following categories: (1) research on measuring components of census error, which includes development of a framework for coverage measurement, matching of cases with minimal information, and identification of census duplicates in real time; (2) research on models for net error, including alternatives to poststratification and synthetic estimation; and (3) research on contamination due to the CFU interview. We also examined preliminary ideas of the Census Bureau regarding the design of the CCM postenumeration survey and the current application of E-StARS to coverage measurement; E-StARS is the Census Bureau research program examining possible applications of merged, unduplicated lists of administrative records.

All of these research efforts support the objective in 2010 of measuring census component coverage errors. Matching cases with minimal information reduces the need to rely on imputation of match status and therefore more clearly determines whether those cases are errors and, if so, what type. The identification of duplicates clearly facilitates their estimation and reduces the estimated number of erroneous enumerations. Improved estimation of net error improves the estimation of the number of omissions. Finally, contamination of the CFU by the CCM interview could result in an unrepresentative census in the P-sample block groups and therefore bias the estimates produced by DSE. We now describe and comment on each of these areas of research in turn.

RESEARCH ON MEASURING COMPONENTS OF CENSUS ERROR

The Census Bureau’s Framework Paper

In considering the measurement of erroneous enumerations, omissions, duplications, and enumerations in the wrong place, it became apparent that the definitions of these coverage errors needed clarification (see National Research Council, 2004b: 252). The Census Bureau therefore decided to develop a framework of precise definitions of census errors, as well as what assumptions supported their estimation, to better guide development of its coverage measurement plan for 2010. The resulting draft document “Framework for Coverage Error Components” (U.S. Census Bureau, 2005) is an excellent attempt to provide this foundation.

This document defines erroneous enumerations as (1) duplicate enumerations, (2) people born after Census Day, (3) people who died before Census Day, and (4) people who are not residents of a housing unit in the United States. Omissions are people who should have been enumerated in the census but were not. By contrast, in A.C.E., which focused on net error, persons had to be enumerated in a housing unit within the search area of the residence (generally the relevant E-sample block cluster) to be considered correctly enumerated. In this new framework, the starting position is that persons must only be enumerated in a housing unit somewhere in the United States to be considered a correct enumeration. This definition of a correct enumeration used in the framework document is not Census Bureau policy; it is instead a useful starting point in developing a comprehensive and clear understanding of the measurement of census coverage error, with the expectation that the geographic dimension will be addressed in later expansions of the framework.

The varying amount of information available for census enumerations complicates the classification of census errors. Data-defined enumerations are those with at least two recorded characteristics; others are non-data-defined enumerations. Among the former, some enumerations have sufficient information for matching and follow-up (complete name and two additional characteristics), and others have insufficient information. The non-data-defined and insufficient information cases could be either correct or erroneous enumerations, since the data are often insufficient to make any further determination.
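
The classification just described can be sketched as a small decision rule. This is an illustration only, not the Census Bureau's processing logic: the field names are hypothetical, and it assumes that a name (even a partial one) counts as one recorded characteristic and that a "complete" name means both a first and a last name are present.

```python
NAME_FIELDS = ("first_name", "last_name")  # hypothetical field names


def is_recorded(value):
    """A characteristic counts as recorded if it is neither None nor blank."""
    return value is not None and value != ""


def classify_enumeration(record):
    """Classify a census record by the amount of information it carries.

    record maps characteristic names to values (None or "" if missing).
    """
    name_parts = [is_recorded(record.get(f)) for f in NAME_FIELDS]
    other = [k for k, v in record.items()
             if k not in NAME_FIELDS and is_recorded(v)]

    # Assumption: any name information counts as one characteristic.
    n_characteristics = len(other) + (1 if any(name_parts) else 0)
    if n_characteristics < 2:
        return "non-data-defined"

    # Sufficient for matching: complete name plus two other characteristics.
    if all(name_parts) and len(other) >= 2:
        return "data-defined, sufficient information"
    return "data-defined, insufficient information"


# Made-up records for illustration.
print(classify_enumeration(
    {"first_name": "Ana", "last_name": "Diaz", "age": 34, "sex": "F"}))
print(classify_enumeration(
    {"first_name": "Ana", "last_name": None, "age": 34, "sex": "F"}))
print(classify_enumeration(
    {"first_name": None, "last_name": None, "age": 34}))
```

Only the first category supports matching and follow-up directly; the other two are the cases whose enumeration status remains undetermined without further work.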

Finally, information to determine enumeration status is collected from the PES. The Census Bureau refers to the list of people that would be enumerated if the P-sample were applied nationally as the notional P-census. Thus, conceptually every potential enumeration falls into one of four cells: (1) those in both the P-census and the census, (2) those in the P-census but not in the census (census omissions), (3) those in the census but not in the P-census (erroneous enumerations and P-census omissions), and (4) those missed by both the P-census and the census.

Potential E-sample cases include correct enumerations and erroneous enumerations but not non-data-defined people or census omissions. The A.C.E. definition of E-sample erroneous enumerations also includes (a) correct enumerations in the wrong location and (b) enumerations with insufficient information for matching. Measurement of census component coverage errors requires separate estimates of the number of enumerations that are in the wrong location and the number of enumerations with insufficient information that are actually erroneous.

To assess the number of omissions, A.C.E. used the P-sample nonmatches, which under the new definitions could be omissions, people enumerated in the wrong location, or non-data-defined people. The challenge here in moving toward a focus on error components is to determine how many of those people were actually missed in the census.

To provide high-quality estimates of census component coverage errors in 2010, the Census Bureau needs to make progress on two fronts. First, it must reduce the inflated estimate of erroneous enumerations. Enumerations with insufficient information need to be examined further, enumerations in the wrong place need to be identified as such, and the remaining unresolved cases need to be treated as nonrandomly missing data. Second, a better method is needed to estimate the number of people missed by both the P-sample and the census. The current approach assumes independence of correct enumeration status and match status within poststrata, and failure of that assumption results in correlation bias.
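
To show what the independence assumption does, the sketch below uses the simplest textbook form of dual-systems estimation for a single poststratum. It is not the production A.C.E. estimator, which involves additional adjustments, and the function names and counts are made up for illustration.

```python
def dual_system_estimate(census_correct, p_sample, matched):
    """Textbook dual-systems estimate of the true population.

    Assumes capture in the census and capture in the P-sample are
    independent within the group being estimated.
    """
    if matched <= 0:
        raise ValueError("need at least one matched case")
    return census_correct * p_sample / matched


def missed_by_both(census_correct, p_sample, matched):
    """Implied count in the cell that neither system observed."""
    census_only = census_correct - matched
    p_sample_only = p_sample - matched
    return census_only * p_sample_only / matched


# Made-up counts for one poststratum.
census_correct = 900   # correct census enumerations
p_sample = 800         # P-sample people
matched = 720          # people found in both

print(dual_system_estimate(census_correct, p_sample, matched))
print(missed_by_both(census_correct, p_sample, matched))
```

With these numbers the three observed cells (720 matched, 180 census only, 80 P-sample only) plus the implied fourth cell add up to the dual-systems estimate. If people missed by the census are also more likely to be missed by the P-sample, the true fourth cell is larger than this calculation implies, which is the correlation bias referred to above.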

Since net error is defined as omissions minus erroneous enumerations, one can estimate omissions by summing reliable estimates of net error and the number of erroneous enumerations. Since net error can be estimated by DSE minus the census, omissions may be estimated by taking the dual-systems estimate minus the census plus the number of erroneous enumerations. However, this estimation strategy needs to be improved through additional data collection to help distinguish enumerations in the wrong location and to better handle cases with insufficient information, as well as through better estimation of the number of people missed by both the PES and the census.
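
The arithmetic in the preceding paragraph can be written out directly. The sketch below follows the definitions in the text (net error equals omissions minus erroneous enumerations, and is estimated by the dual-systems estimate minus the census count); the function names and the numbers are made up for illustration.

```python
def net_error(dse, census_count):
    """Estimated net error: dual-systems estimate minus the census count."""
    return dse - census_count


def estimated_omissions(dse, census_count, erroneous_enumerations):
    """Omissions = net error + erroneous enumerations."""
    return net_error(dse, census_count) + erroneous_enumerations


# Made-up figures: the census counted 990 people, 30 of them erroneously,
# and the dual-systems estimate of the true population is 1,000.
dse = 1000
census_count = 990
erroneous = 30

print(net_error(dse, census_count))
print(estimated_omissions(dse, census_count, erroneous))
```

In this example the census contains 960 correct enumerations, so 40 of the estimated 1,000 people were omitted, even though the net error is only 10. The result is only as good as its inputs, which is why the text stresses better estimates of erroneous enumerations and of the people missed by both systems.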

The framework document also addresses how to estimate these various error components and what assumptions they are based on. Additional information will be collected in 2010 regarding other residences at which someone might have been counted to determine more accurately whether a nonmatched P-sample enumeration is actually an omission and which of a set of duplicates is the correct enumeration. Furthermore, there will be greater efforts made to match cases with “insufficient” information. Finally, missing data models will be developed to treat cases that are not data-defined.

The panel supports the general approach described in this draft framework, which is consistent with recommendations in National Research Council (2004b). This is an important first step toward developing a feedback loop linking the measurement of census component coverage error to deficiencies in specific components of census processes.

The panel has some concerns about the proposed treatment of imputations in the draft framework. A focus on the correctness of an imputation as an enumeration is misplaced, as are concerns about the correctness of imputations of characteristics. Imputations are simply the means to an end, which is improved census estimation, and it is the quality of the estimates collectively that should be assessed. For example, if a characteristic of a known person is imputed, the question of whether that is the person’s correct value is of no interest. The critical question concerns whether census estimates that involve that characteristic are collectively improved by the imputation, which will tend to be the case if the imputation model is sensible. The same principle applies to whole-person imputations. (This approach is compatible with a focus on components of error, since the measures used are for aggregates rather than individuals.)

Finally, different errors may be important for different uses of the census numbers, so the framework should be sufficiently flexible to allow for aggregating component errors in more than one way. For example, for estimation of broad demographic distributions (to predict future Medicare enrollment), an error in age might be important, but misplacing a person geographically would be of little consequence. Conversely, for redistricting purposes, a person’s exact age is unimportant but geographical accuracy is critical. The panel hopes to examine this more in our final report.

Matching Cases with Minimal Information

In the 2000 census, for an enumeration to have sufficient information for matching and follow-up, it needed to include that person’s complete name and two other nonimputed characteristics. In A.C.E. in 2000, there were 4.8 million (sample survey weighted) data-defined enumerations with insufficient information for matching and follow-up, meaning that they contained two characteristics. These cases were coded as “KE” cases in A.C.E. processing, and we retain that terminology. A.C.E. estimation treated KEs as erroneous enumerations, and they were removed from the census enumerations prior to dual-systems computations. (If KEs are similar in all important respects to census enumerations with sufficient information for matching, removal from dual-systems computations increases the variance of the resulting estimates, but it does not greatly affect the estimates themselves.) Removal of KEs helped to avoid counting a person twice because matches for these cases are difficult to ascertain. Also, it was difficult to follow up these E-sample cases to determine their match status if they initially were not matched to the P-sample, because of the lack of information with which to identify the person to interview.

However, some unknown and possibly large fraction of these cases were correct enumerations. Therefore, removing these cases from the matching inflated the estimate of erroneous enumerations, and it also inflated the estimate of the number of census omissions by about the same amount, since roughly the same number that are correct enumerations would have matched to P-sample enumerations. Given that the emphasis in 2000 was on the estimation of net census error, this inflation of the estimates of the rates of erroneous enumeration and omission was of only minor concern. However, with the new focus in 2010 on estimates of components of census error, there is a greater need to find alternative methods for treating KE enumerations. One possibility that the Census Bureau is currently exploring is whether many of these cases can be matched to the P-sample data using information on other household members.

To examine this, the Census Bureau carried out an analysis using 2000 census data on 13,360 unweighted data-defined census records that were found to have insufficient information for matching, to determine whether some of them could be reliably matched. (For details, see Auer, 2004, and Shoemaker, 2005.) This clerical operation used name, date of birth, household composition, address, and other characteristics to match these cases to the P-sample. For the 2000 A.C.E. data, 44 percent of the KE cases examined were determined to match to a person who lived at the same address on Census Day and was not otherwise counted, with either “high confidence” or “medium confidence” (which are reasonable and objectively defined categories of credibility). For the 2000 census, this would have reclassified more than 2 million census enumerations from erroneous to correct enumerations, as well as a like number from P-sample omissions to matches, thereby greatly reducing the estimated number of census component coverage errors. For the remaining unresolved cases, the current view on the part of the Census Bureau is to treat them as missing data in the estimation of rates of component census error. However, this issue has yet to be studied.

The treatment of KEs can be viewed as another component of “error” in the same way that a person incorrectly geocoded is an error—that is, that it is a problem for processing but not a part of what we would call an omission or an erroneous enumeration. The use of the term “erroneous enumeration” for these cases in the past is inappropriate. Cases with insufficient information should be treated as having unknown or uncertain enumeration or match status. The term “erroneous” should be reserved for incorrect enumerations. The terminology used therefore needs to distinguish between types of error and the uncertainty associated with these types of error for particular cases.

The panel strongly supports this research. In considering further development of the idea, it would be useful to try to find out more about any characteristics associated with KEs in order to find out how to reduce their occurrence in the first place. (E-StARS might be useful for this purpose.) Furthermore, the clerical operation used to determine the status of KEs was resource intensive, and it would be useful to try to automate some of the matching to reduce the size of this clerical operation in 2010. Ultimately, the Census Bureau should rethink the definition of cases with insufficient information for matching more generally.

DISCOVERY OF CENSUS DUPLICATES AND P-SAMPLE MATCHES TO CENSUS RECORDS OUTSIDE THE SEARCH AREA

Duplication in the census can result from a number of different circumstances. Some possibilities include housing unit duplication in the Master Address File, counting college students both at home and away at school, counting children in joint custody at both parents’ homes, counting movers both at their current home and at their previous home, counting people with vacation homes both at their usual home and at their vacation home, counting friends and relatives at a home at which they are staying temporarily, counting at both residences people who keep one residence for commuting to work and a separate residence for weekends, and counting people in nursing homes and prisons both there and at a residence of their immediate family members.

The Census Bureau implemented a computer and clerical operation to identify and remove duplicate housing units during the middle of the 2000 census due to an indication that a large number of duplicate housing units were included in the census (see National Research Council, 2004b, for details). In this operation, potential duplicate housing units were identified through the use of both person and housing unit matching. Criteria were developed that attempted to distinguish between actual housing unit duplications and form misdeliveries (typical of multiunit structures), which were retained in the census, since there is often no error as a result.

The Census Bureau subsequently carried out research to determine how many duplicate persons remained in the census (see, e.g., Mule, 2001, and Fenstermaker, 2002). This was done by computer matching the E-sample to the entire nation’s enumerations, wherein a “match” was determined by agreement of birth date and year and first and last names. (This could not have been accomplished in any previous census, since name and date of birth had not been captured electronically prior to the 2000 census.) In addition to the computer search, Fay (2002, 2003, 2004) developed a series of increasingly refined statistical models that estimated the percentage of matches discovered in this way that were coincidental and therefore not real duplicates. This research indicated that there were 5.8 million duplicates in the census after the duplicate housing search, compared with the 1.9 million duplicates found by A.C.E. (National Research Council, 2004b: Chapter 6). This research also provided characteristics of people who were more likely to be duplicated, including minority children, college students, minority young adults, people duplicated between housing units and group quarters (especially correctional institutions and nursing homes), and minority men ages 25 to 64.
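The exact-agreement rule described above can be illustrated with a small sketch. It is not the Census Bureau's matching software: the record layout, the field names, and the normalization (trimming and upper-casing names) are assumptions made for the example. It also returns only candidate duplicates, since some exact agreements are coincidental and would need the kind of statistical modeling attributed to Fay.

```python
from collections import defaultdict


def find_duplicate_candidates(records):
    """Group record ids that agree exactly on first name, last name, and date of birth."""
    groups = defaultdict(list)
    for rec in records:
        # Hypothetical normalization: ignore case and surrounding whitespace in names.
        key = (rec["first"].strip().upper(),
               rec["last"].strip().upper(),
               rec["dob"])
        groups[key].append(rec["id"])
    # Only keys shared by more than one record are candidate duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up records; ids 1 and 2 agree on all three fields, id 3 differs in birth year.
records = [
    {"id": 1, "first": "Ana", "last": "Lopez", "dob": "1990-04-12"},
    {"id": 2, "first": "ANA ", "last": "Lopez", "dob": "1990-04-12"},
    {"id": 3, "first": "Ana", "last": "Lopez", "dob": "1991-04-12"},
]
candidates = find_duplicate_candidates(records)
print(candidates)
```

Keying on the full tuple keeps the search linear in the number of records, which matters when every E-sample record is compared against a national file.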

Given the success of this research effort, the Census Bureau is now planning to implement the identification of duplicates and the corresponding correct enumerations nationally and in real time in 2010. Similarly, P-sample enumerations will be matched to census records outside the P-sample search area (the surrounding block clusters).

Implementing these operations on the scale needed and under severe time constraints will present a number of challenges. It is sometimes difficult to distinguish between duplication and form misdelivery. Also, in some cases, it can be difficult to determine a person’s correct residence, for example, for children in a shared custody situation. Furthermore, the goal is to resolve the entire household roster rather than just determining which enumeration is correct. The Census Bureau needs to estimate the resources that will be needed to support this effort. Estimating various timing and resource issues through a census test will be difficult, since census tests involve field work for only a few counties, and there is typically no field validation of cases outside the test census area.

RESEARCH ON MODELS FOR NET COVERAGE ERROR

Even with a primary goal of estimating census component coverage error, there are still two important reasons to continue research on net coverage error models. First, as mentioned previously, models for net coverage error are needed to estimate the number of census omissions. Second, census data users find information on net coverage error useful. Reliable estimates of net coverage error indicate the degree to which demographic groups are differentially undercovered in the census, and they indicate the degree to which geographic jurisdictions are differentially undercovered. This helps users decide whether census counts should be used for various purposes.

A.C.E. Research Database

The Census Bureau’s research on net coverage error has been greatly facilitated by the development of an A.C.E. research database. Briefly stated, this database contains the data collected through A.C.E. to support estimation of net coverage error in 2000, and it is also weighted to represent the additional information collected from the national duplicates search and the evaluation follow-up survey, so that the net coverage error estimates produced are nearly identical to those from A.C.E. Revision II.

The panel has made modest use of this database, and it commends the Census Bureau for supporting its development. It would be beneficial if this database could be made available to researchers outside the Census Bureau after addressing confidentiality issues. One possibility would be to make a confidentiality-protected version of this database available at the Census Research Data Centers.

Logistic Regression for Estimating Net Coverage Error

The Census Bureau is examining the use of logistic regression modeling to estimate net census error, replacing the use of poststrata and synthetic estimation. The motivation is to be able to utilize many more predictors, including continuous predictors, in fitting the probability of match and correct enumeration status.

Poststratification is mentioned in the earliest literature advocating DSE (Sekar and Deming, 1949), and it has been used in the census since the 1980 postenumeration program (PEP) to reduce correlation bias. This is accomplished by estimating the adjusted counts separately within poststrata. (Recall that C stands for the number of census enumerations, II represents the number of people lacking sufficient information for matching, CE represents the number of E-sample persons correctly enumerated in the census, E represents the number of E-sample enumerations, P stands for the estimate of the number of all valid P-sample persons, and M stands for the estimate of the number of P-sample persons who match with an E-sample person. Note that here, CE is defined consistent with the definition of a correct enumeration in A.C.E., that is, an enumeration that is located within the search area and is therefore not the definition of a correct enumeration found in the framework document.)

A perfect poststratification would partition the true population and the E-sample population so that the underlying enumeration propensities for individuals within a poststratum are identical. However, this is unattainable and therefore the practical goal is to partition the sample cases so that individuals are more alike within a poststratum than individuals are from different poststrata. If this is accomplished, correlation bias should be reduced. Poststratification also supports the use of synthetic estimation, which is used to carry down adjustments to very low geographic levels. Synthetic estimation makes use of coverage correction factors, $\mathrm{CCF} = \mathrm{DSE}/C$, the ratio of the dual-systems estimate for a poststratum to its census count. The factors are applied to any subpopulation within the poststratum by multiplying this factor by the relevant subpopulation’s census count to produce the adjusted count for that subpopulation. Estimates of the uncertainty about synthetic estimates for small areas combine estimates of the variance from the estimation of the undercount rates in each poststratum and residual variation due to heterogeneity of small areas within the same poststratum. The first component can be estimated by standard methods; approaches to the second are more difficult and are discussed in a later section.
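A minimal sketch of synthetic estimation follows, assuming that a coverage correction factor is the poststratum dual-systems estimate divided by the poststratum census count. The two poststrata, the function names, and all counts are hypothetical.

```python
def coverage_correction_factor(dse: float, census_count: float) -> float:
    """Ratio of the dual-systems estimate to the census count for one poststratum."""
    return dse / census_count


def synthetic_estimate(area_census_counts: dict, ccf: dict) -> float:
    """Adjusted count for a small area: scale each poststratum's census count by its factor."""
    return sum(count * ccf[ps] for ps, count in area_census_counts.items())


# Hypothetical national poststratum totals: (dual-systems estimate, census count).
ccf = {
    "owner": coverage_correction_factor(1010.0, 1000.0),   # 1 percent net undercount
    "renter": coverage_correction_factor(2100.0, 2000.0),  # 5 percent net undercount
}

# Hypothetical census counts for one small area, split by poststratum.
area = {"owner": 200, "renter": 100}
adjusted = synthetic_estimate(area, ccf)
print(adjusted)
```

Every small area inherits the same factor for a given poststratum, which is why heterogeneity among areas within a poststratum contributes the residual variation mentioned above.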

While poststratification has the advantages of reducing correlation bias and supporting synthetic estimation, a major disadvantage is that it allows only a relatively small number of factors to be included in the poststratification scheme (and in the resulting synthetic estimation). This is because the Census Bureau includes the full cross-classification of most of the factors to define the poststratification, and, as a result, the poststrata quickly become very sparsely populated, despite the large sample size of the PES. Use of many poststrata thus improves homogeneity within poststrata at the price of estimates with high sampling variances. Furthermore, because the formation of independent poststratum estimates does not recognize that poststrata with similar characteristics are likely to have similar rates for matching and correct enumeration, separately treating those poststrata fails to pool data when it would be beneficial to do so.

An alternative to poststratification is logistic regression of the binary match/nonmatch and correct enumeration/not correct enumeration variables on the available predictive factors. This approach allows the inclusion of more factors in the model, since it does not require factors to be treated as categorical, and it allows high-order interactions to be included or omitted as desired. Poststratification is the special case of logistic regression in which the predictors are indicator variables for membership in the categories defining the poststrata and all interactions are included in the model (see Box 3-1). In theory, small-area estimates under logistic regression could improve on those provided through synthetic estimation by using more predictors to predict the probabilities of match and correct enumeration status, and hence reducing correlation bias.

Logistic regression was suggested for use in the general area of DSE by Huggins (1989) and Alho (1990) and specifically applied to census undercoverage in Alho et al. (1993). However, these studies did not consider unresolved cases and made use of the data only from the P-sample blocks, rather than the full census. Haberman et al. (1998) introduced some additional features that addressed the above limitations. They proposed two separate logistic regressions to model match status (using P-sample data) and correct enumeration status (using the E-sample data). To represent cases with unresolved match status (with a completely analogous discussion of correct enumeration status), two records are constructed, one “matched” and the other “unmatched,” and weights are used to represent the “probability” that a given record matched to the census, given the available characteristics. (Match and correct enumeration probabilities for unresolved cases could be provided by a computer matcher developed by the Census Bureau.) Survey weights are also attached to all the records to reflect the complex sample design.
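The two-record device for unresolved cases can be sketched as a data-preparation step. The field names are hypothetical, and multiplying the survey weight by the match probability is an assumption about how the two kinds of weights combine. The sketch is not drawn from Haberman et al. (1998) or from Census Bureau code.

```python
def expand_unresolved(cases):
    """Return weighted records ready for a weighted logistic regression of match status.

    A resolved case yields one record. An unresolved case yields a "matched" and an
    "unmatched" record whose weights split the survey weight by the match probability.
    """
    records = []
    for case in cases:
        w = case["survey_weight"]
        if case["match"] is not None:
            records.append({"y": case["match"], "weight": w, "x": case["x"]})
        else:
            p = case["match_prob"]  # e.g., supplied by a computer matcher
            records.append({"y": 1, "weight": w * p, "x": case["x"]})
            records.append({"y": 0, "weight": w * (1.0 - p), "x": case["x"]})
    return records


# Hypothetical cases: one resolved match and one unresolved case.
cases = [
    {"survey_weight": 50.0, "match": 1, "match_prob": None, "x": {"tenure": "owner"}},
    {"survey_weight": 100.0, "match": None, "match_prob": 0.25, "x": {"tenure": "renter"}},
]
records = expand_unresolved(cases)
print([(r["y"], r["weight"]) for r in records])
```

The expansion preserves each case's total weight, so the weighted sample size is unchanged by the treatment of unresolved cases.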

BOX 3-1

Logistic Regression as a Generalization of Poststratification

To see that logistic regression is a generalization of poststratification, consider the following generic logistic regression model (used for either modeling percentage matched or percentage correct enumeration): $\log\left(\frac{p_i}{1-p_i}\right) = \sum_k \beta_k X_{ki}$, where $p_i$ is the probability that individual $i$ in the PES matches a census enumeration or is a correct enumeration, $X_{ki}$ is the value for individual $i$ of the $k$th explanatory variable, and $\beta_k$ is the associated regression coefficient for $X_{ki}$. If we assume that each of the $X_{ki}$’s is a variable that equals 1 when the $i$th individual is in the $k$th poststratum, and 0 otherwise, then $\beta_k$ is chosen so that the observed matching rate in the $k$th poststratum is equal to $e^{\beta_k}/(1+e^{\beta_k})$.

The Census Bureau’s approach is a generalization of this, whereby it uses several categorical variables at various levels (e.g., sex and age), and combinations of levels of the categorical variables play the role of the above variables for the individual poststrata. A simple example would be where $\beta_0$, $\beta^{A}_i$, $\beta^{B}_j$, and $\beta^{AB}_{ij}$ are selected so that the observed matching percent in the $(i,j)$th poststratum (relative to the $i$th and $j$th levels of the two classification variables) was equal to $\frac{\exp(\beta_0 + \beta^{A}_i + \beta^{B}_j + \beta^{AB}_{ij})}{1 + \exp(\beta_0 + \beta^{A}_i + \beta^{B}_j + \beta^{AB}_{ij})}$, where $\beta^{AB}_{ij}$ indicates the regression coefficient for the relevant interaction term.
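The equivalence described in the box can be checked numerically. With one indicator per poststratum and no intercept, the likelihood equation for each coefficient says that the observed number of matches equals the fitted number. The sketch below uses hypothetical unweighted counts, sets each coefficient to the logit of the observed rate, and confirms that this equation holds.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical (matched, total) counts for two poststrata.
counts = {"A": (90, 100), "B": (160, 200)}

# Candidate solution: the coefficient is the logit of the observed match rate.
beta = {k: logit(m / n) for k, (m, n) in counts.items()}

# Score for beta_k: sum over members of (y_i - p_i) = matched - total * expit(beta_k).
score = {k: m - n * expit(beta[k]) for k, (m, n) in counts.items()}

for k in counts:
    print(k, round(beta[k], 4), round(expit(beta[k]), 4), abs(score[k]) < 1e-9)
```

Dropping interaction indicators replaces these cell-by-cell equations with equations on margins, which is how the reduced models pool information across poststrata.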

This approach is the Census Bureau’s leading candidate for net coverage error modeling in 2010.

Haberman et al. (1998) fit two separate logistic regression models. The first one, using P-sample data, models the probability of matching a P-sample case to the E-sample, and the second one, using E-sample data, models the probability that a census enumeration is correct. Relating this to DSE, in the expression $\mathrm{DSE} = (C - II) \cdot \frac{CE}{E} \cdot \frac{P}{M}$, the probability that a census enumeration is correct is the second factor, and the probability of a match is the inverse of the third factor. Therefore, the two logistic regression models estimate two of the three main factors in DSE, the remaining factor representing the number of matchable enumerations in the census. The regression coefficients of these two logistic regression models are fit by maximizing the weighted log likelihood, which is a measure of the goodness of fit of the logistic regression model to the data.
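The three-factor reading of DSE can be made concrete with a small calculation. The sketch assumes the dual-systems estimate is the number of matchable enumerations times the correct enumeration rate divided by the match rate, using the symbols C, II, CE, E, P, and M as defined earlier. All numbers are hypothetical.

```python
def dual_systems_estimate(C, II, CE, E, P, M):
    """DSE as the product of three factors for one poststratum."""
    matchable = C - II             # census enumerations with sufficient information
    correct_enum_rate = CE / E     # what the E-sample logistic regression models
    match_rate = M / P             # what the P-sample logistic regression models
    return matchable * correct_enum_rate / match_rate


# Hypothetical poststratum: 1,000 census enumerations, 50 lacking sufficient information.
dse = dual_systems_estimate(C=1000, II=50, CE=190, E=200, P=200, M=180)
print(round(dse, 1))
```

In the regression version, the two rates are replaced by person-level fitted probabilities rather than poststratum averages.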

Using logistic regression, synthetic estimation can now be replaced by the following methodology. Letting $\hat{p}_{ce,i}$ represent the estimate from logistic regression of the correct enumeration probability for person $i$, and letting $\hat{p}_{m,i}$ represent the estimate from logistic regression of the match probability for person $i$, the estimated number of people in a small area is the sum of the ratio $\hat{p}_{ce,i}/\hat{p}_{m,i}$ over the individuals $i$ in that area (ignoring the treatment of cases with insufficient information for matching).⁴ A grouped jackknife procedure is used to obtain the standard errors of the small-area estimates. If the explanatory variables are limited to those collected in the census, not characteristics or process variables from A.C.E. or CCM, small-area estimates can be computed directly using the method just described. However, this approach sacrifices the additional predictive power of covariates collected for cases in the P-sample. Techniques suggested by Eli Marks may be used to accommodate use of P-sample variables at the subnational level. For details, see Marks et al. (1974).⁵
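The small-area estimator just described is a sum, over the census enumerations in the area, of the fitted correct enumeration probability divided by the fitted match probability. A minimal sketch with made-up fitted probabilities (the grouped jackknife for standard errors is not shown):

```python
def small_area_estimate(fitted):
    """Sum of p_ce / p_m over the census enumerations in one small area.

    `fitted` is an iterable of (p_ce, p_m) pairs: the fitted correct enumeration
    probability and the fitted match probability for each enumeration.
    """
    return sum(p_ce / p_m for p_ce, p_m in fitted)


# Hypothetical fitted probabilities for three enumerations in one block.
block = [(0.95, 0.90), (0.98, 0.92), (0.90, 0.85)]
estimate = small_area_estimate(block)
print(round(estimate, 3))
```

Each enumeration contributes more than one person when its match probability is below its correct enumeration probability, and less than one person otherwise.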

A few complicating issues are raised when using these methods. One is how to incorporate the survey weights in the model-building and model-fitting processes. The CCM PES sampling weights need to be incorporated not only in the estimation of the logistic regression coefficients, but also in the decision as to which predictors to include in the logistic regression models and which model form to use, as well as in estimating the variance of the resulting estimates. The question of how to treat the complex sample design in these types of models has a substantial research literature. The approach taken by the Census Bureau is to weight the fractional cases by the inverse of the sampling weights. An alternative approach, which may produce more efficient estimates, is to include the variables that make up the sampling weights as predictors in the model (e.g., Little, 2003). Comparisons of these two approaches would be of interest.⁶

⁴	In this discussion, we are ignoring missing data in covariates, which introduce some complexities into the above development.

⁵	The Census Bureau has examined competing estimators that all have empirical deficiencies in comparison to the above estimate. As mentioned, the estimate for the population of a domain is $\sum_i \hat{p}_{ce,i}/\hat{p}_{m,i}$ (the fitted correct enumeration probability divided by the fitted match probability) for all individuals $i$ in a domain. A competing estimator that the Census Bureau has mentioned is [formula missing in source], which is now a sum over the individuals $j$ in the PES blocks and in the relevant domain. Another competing estimator replaces the correct enumeration probability in these two alternatives by an indicator function for those individuals in the domain that had correct enumeration status, reducing the modeling to only the logistic regression model of match status. The problem with these two alternatives is that they are too sensitive to sampling variation. The Census Bureau has also considered variants of these two alternatives by reweighting the data elements so that the data-defined persons from the E-sample are ratio-adjusted to the census counts within poststrata. A further problem with some of these approaches is that small-area estimates do not necessarily sum to the estimates for larger areas.

A second complication is the treatment of missing data. Specifically, it is not clear how to treat cases with insufficient information for matching in the estimation of the logistic regression coefficients. Regarding the small-area estimation that results from the use of the logistic regression model, it is also not clear how to treat non-data-defined cases. We hope to provide more guidance on these issues in our final report.

Over the past two years, the Census Bureau has examined the use of logistic regression models to estimate net census error, focusing attention to date on the performance of six sets of explanatory variables for both the P-sample matches and the E-sample correct enumerations. These sets all use explanatory variables that are indicator variables of various combinations of the levels of the following six factors: race/origin (7 groups), age/sex (7 groups), tenure (owner, nonowner), metropolitan statistical area/type of enumeration area (MSA/TEA) (4 groups), region (4 groups), and mail return rate (high or low, with boundaries dependent on race/origin domain). The six sets are:

1. The 416 indicator variables for the poststrata used in the March 2001 poststratification.
2. The 150 main effects and first-order interactions of the variables used to define the March 2001 poststratification.
3. The 23 main effects of the variables used to define the March 2001 poststratification.
4. The 98 main effects and all interactions from the variables for just three of the six factors from the March 2001 poststratification, that is, race/origin, age/sex, and tenure (omitting MSA/TEA, region, and mail return rate). The acronym ROAST is used to distinguish this reduced set of factors from the full set used in the 2001 poststratification.
5. The 62 main effects and first-order interactions from ROAST.
6. The 14 main effects from ROAST.⁷

These six models were fit to data from the A.C.E. research database (for further details, see Griffin, 2005a).

⁶	Another complication that we will not discuss further here is that the adjustments made on the A.C.E. research file have resulted in the dependent variables occasionally lying outside of the interval (0,1).
⁷	The number of interactions does not correspond to the situation of fully crossed effects since the poststratification used in 2000 did not fully cross the six variables. For example, the poststratum of American Indians or Alaskan Natives living on a reservation is crossed only by age/sex and tenure, but not by MSA/TEA, region, or mail return rate, and this extends to the ROAST models.

Model Comparisons

The Census Bureau compared the five logistic regression models (models 2-6) with the 2001 poststratification (model 1) for predicting A.C.E. data. Equivalence would seem sufficient in this comparison, since logistic regression can make use of many variables in addition to those used to define the poststrata.

When comparing nested models, that is, models that are identical except that some of the parameters in one of the models have been constrained to be constant (usually zero), the distinction between fitting and prediction is typically represented by adding a penalty for the number of additional parameters to a goodness-of-fit measure. As in linear regression, the additional parameters guarantee that the model with the larger set of predictors will fit the data at least as well as the more parsimonious model, but this advantage may be offset by the increased variances of the fitted values, due to estimating more parameters. The combination of the goodness-of-fit statistic and the penalty for additional parameters reflects this trade-off. Haberman et al. (1998) suggested using a logarithmic penalty function, with a jackknife bias estimate to adjust for the use of unnecessary predictors (overfitting). The Census Bureau studied the use of goodness-of-fit tests based on the Satterthwaite approximation. Measures such as Mallows’ $C_p$ and the information criteria AIC and BIC provide useful penalties for comparing regression models in a predictive situation.

The theory for comparing nonnested models is less straightforward, but such comparisons will also be needed. One such nonnested alternative separates the modeling of undercoverage into two models, one for the probability that an entire household will be missed, and another for the probability that an individual in a partially enumerated household will be missed (for details, see Griffin, 2005b). The panel and the Census Bureau agreed that cross-validation would be a suitable technique for comparing rival nonnested models. In cross-validation, the sample is split so that the model can be fitted to one part and the accuracy of predictions evaluated on the other part; the accuracy of prediction is thus not overstated due to fitting and evaluating the model on the same data. A standard approach is to split the data into several equal-sized pieces and remove each piece in turn from the fitting data set. The performance of each fitted model is assessed using some loss function in predicting the values for the set-aside fraction, and the loss function is averaged over all of the replications so defined. The Census Bureau implemented cross-validation using 100 equal-sized groups, and the loss function used was the logarithmic penalty function from Haberman et al. (1998). Finally, the average over all 100 groups was weighted using the A.C.E. survey weights.
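The cross-validation procedure described above can be sketched in a few lines. This is a simplified stand-in rather than the Census Bureau's implementation. It replaces the logistic regression with a cell-rate model, uses 5 folds instead of 100, takes the logarithmic penalty to be the weighted average negative log likelihood, and weights folds by their total survey weight. The data are simulated and the cell names are hypothetical.

```python
import math
import random


def log_penalty(y, p_hat, w):
    """Weighted average of -[y log p + (1 - y) log(1 - p)]; smaller is better."""
    loss = sum(wi * -(yi * math.log(pi) + (1 - yi) * math.log(1 - pi))
               for yi, pi, wi in zip(y, p_hat, w))
    return loss / sum(w)


def fit_cell_rates(rows):
    """Weighted outcome rate per cell, lightly smoothed to stay strictly inside (0, 1)."""
    num, den = {}, {}
    for cell, y, w in rows:
        num[cell] = num.get(cell, 0.0) + w * y
        den[cell] = den.get(cell, 0.0) + w
    return {c: (num[c] + 0.5) / (den[c] + 1.0) for c in den}


def cross_validate(rows, k=5, seed=0):
    """Fit on k - 1 folds, score the held-out fold, and average over the k folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    losses, fold_weights = [], []
    for i, held_out in enumerate(folds):
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        rates = fit_cell_rates(train)
        # Fallback rate for any cell that is absent from the training folds.
        overall = fit_cell_rates([("all", y, w) for _, y, w in train])["all"]
        y = [r[1] for r in held_out]
        w = [r[2] for r in held_out]
        p = [rates.get(r[0], overall) for r in held_out]
        losses.append(log_penalty(y, p, w))
        fold_weights.append(sum(w))
    return sum(l * fw for l, fw in zip(losses, fold_weights)) / sum(fold_weights)


# Simulated (cell, matched, survey weight) rows for two hypothetical cells.
rng = random.Random(1)
rows = [(cell, 1 if rng.random() < rate else 0, 1.0 + rng.random())
        for cell, rate in [("owner", 0.95), ("renter", 0.85)]
        for _ in range(200)]
score = cross_validate(rows, k=5)
print(round(score, 4))
```

Competing models, nested or not, would be compared by running the same folds through each model's fitting step and comparing the resulting averages.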

The results of the Census Bureau’s cross-validation comparison of the five alternative logistic regression models to the 2000 A.C.E. poststratification (Griffin, 2005a) are given in Table 3-1.The column labeled Correct Enumeration provides the cross-validation statistic for each of the six models in estimating the correct enumeration rate, and the column labeled Match provides the cross-validation statistic for each of the six models in estimating the match rate. The orderings observed and to a substantial degree even the average weighted log likelihoods (not shown here) did not change when

TABLE 3-1 Cross-Validation of Six Preliminary Logistic Regression Models: Average Weighted Log Penalty Function

Model	Number of Parameters	Correct Enumeration	Match
1. Poststratification	416	.2351	.2603
2. Main effects and two-way interactions	150	.2349	.2598
3. Main effects	23	.2354	.2598
4. ROAST	98	.2355	.2617
5. ROAST main effects and two-way interactions	62	.2355	.2618
6. ROAST main effects	14	.2360	.2619
NOTE: The logarithmic penalty function that was used in the cross-validation for the correct enumeration rate modeling was $-\frac{1}{W_E}\sum_{i} w_{cei}\left[p_{cei}\log\hat{p}_{cei} + (1 - p_{cei})\log(1 - \hat{p}_{cei})\right]$, where $W_E$ is the weighted total for the E-sample, $w_{cei}$ is the sampling weight for the i^th E-sample individual, $p_{cei}$ is the correct enumeration rate, and $\hat{p}_{cei}$ is the predicted correct enumeration rate from the model. An analogous function was used for match rate modeling. Given the negative sign in this expression, smaller values of this statistic imply a better fit to the data.

the number of cross-validation replications changed from 100 to 25 or 20 (also not shown here). On one hand, the similarity of fit across the models suggests that many of the interactions in the poststratification model are relatively small. On the other hand, these findings support the view that even the most effective of the five alternative models, model 2 (main effects and two-way interactions of the poststratification variables), offers only minor benefits over the full poststratification. However, these models are limited to use of the variables in the 2000 poststratification and do not assess the potential of other predictors or model forms.

The panel further examined the use of cross-validation to assess the impact of the use of survey weights on the performance of the model. To examine this, using the logistic regression model with only the main effects from the poststratification (model 3), we formed 100 groups for the cross-validation. (This was done in two ways to examine the degree to which the block clusters were homogeneous. In one computation, we randomly selected P- and E-sample persons into 100 groups for cross-validation without regard for block cluster membership; in the second computation, we randomly selected P- and E-sample persons into 100 groups maintaining the block cluster structure of the A.C.E. sample design.) Using the cross-validation, we compared the performance of the logistic regression model unweighted by the survey weights with the performance of the logistic regression model weighted by the survey weights, with performance assessed using the weighted log-likelihood penalty function. The results are given in Table 3-2.

TABLE 3-2 Examination of Variance of Cross-Validation and Impact of Survey Weighting

Average Log-Likelihood Penalty Function over 100 Cross-Validation Replications

Model Fitting	Group Formation	E-sample	P-sample
Unweighted	Random selection	.278181	.327804
Unweighted	Maintain clusters	.278484	.328091
Weighted	Random selection	.235700	.261477
Weighted	Maintain clusters	.235982	.261930
NOTE: See note to Table 3-1 for details on the average log-likelihood penalty function.

The results suggest that use of the survey weights in computing the logistic regression coefficients substantially improves the performance in comparison to unweighted fitting, as assessed by the weighted criterion. This raises the possibility that inclusion of the survey design variables as predictors may provide some benefits.
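The two ways of forming cross-validation groups compared in Table 3-2 (random selection of persons versus maintaining the block cluster structure) might be sketched as follows. The cluster labels and group counts are hypothetical, and the functions only assign group numbers; they are not the panel's actual code.

```python
import random


def random_folds(n_persons, n_folds, seed=0):
    """Assign persons to groups at random, ignoring block cluster membership."""
    order = list(range(n_persons))
    random.Random(seed).shuffle(order)
    fold_of = [0] * n_persons
    for position, person in enumerate(order):
        fold_of[person] = position % n_folds
    return fold_of


def cluster_folds(cluster_ids, n_folds, seed=0):
    """Assign whole block clusters to groups, so a cluster is never split."""
    clusters = sorted(set(cluster_ids))
    random.Random(seed).shuffle(clusters)
    fold_of_cluster = {c: position % n_folds for position, c in enumerate(clusters)}
    return [fold_of_cluster[c] for c in cluster_ids]


# Hypothetical sample: 12 persons in 4 block clusters of 3 persons each.
cluster_ids = ["b1"] * 3 + ["b2"] * 3 + ["b3"] * 3 + ["b4"] * 3
person_level = random_folds(len(cluster_ids), n_folds=2)
cluster_level = cluster_folds(cluster_ids, n_folds=2)
print(person_level)
print(cluster_level)
```

If persons within a block cluster resemble one another, person-level groups let information about a cluster leak from the fitting data into the set-aside data, so the two schemes can give different loss estimates. Note that cluster-level groups are equal in size only when the clusters themselves are.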

Any predictors used in a logistic regression model must be available from census data to allow estimation of net census error nationally (at least in the form currently preferred by the Census Bureau). This restricts the available predictors to functions of the six factors used in the A.C.E. poststratification, a few additional variables from the short form in 2010, any variables collected during census processing, and contextual variables collected at aggregate geographic levels (say, from the American Community Survey or E-StARS). Recently, Schindler (2006) examined many of these other possible variables to see whether they provided substantial additional benefits as additional factors in producing post-strata (but not in a logistic regression approach). He considered the following variables: (1) geographic—census region, state, urban-rural, and mode of census data collection (mailout-mailback, list-enumerate, or list-leave); (2) contextual variables at the tract level—mail return rate, and percentage minority; (3) family and housing variables—marital status, relation to the head of household, and structure code (single unit or multiunit); and (4) census operational variables—indicator of mail or enumerator return, date of return, and proxy status. Schindler (2006) did not discover any variables that provided substantial benefit over and above that of the 416 indicator variables from the poststratification used in A.C.E.

This analysis, while extremely important, should not be considered conclusive, in particular in the context of a logistic regression model. For example, in the related problem of examining large numbers of subsets of a collection of possible predictors for use in regression-type models, it is difficult to know whether one has missed an effective combination of variables. This is difficult work, and further examination of potential predictors (or transformations of the predictors or interaction terms of the predictors) may still prove useful. (To assess the novel contributions of sets of covariates that are correlated, principal component analysis might provide useful insights.)

The panel strongly advocates further work in developing these logistic regression models, given their promise. We offer two additional possibilities for broadening the approaches under consideration. The Census Bureau may wish to look into the applicability of some of these methodologies, even if only to aid in the search for predictors for its logistic regression models.

First, it is not necessary that one logistic regression model be used nationwide. Different regression coefficients and even different predictors could be used for different geographic and/or demographic domains.

Second, logistic regression is only one of many statistical models that predict a dichotomous dependent variable. Of late, methods such as classification trees have been shown to have some applicability. One way to consider this research problem, broadened to encompass not only net coverage error modeling through use of logistic regression, but also census component coverage error measurement, is that these problems are essentially discriminant analysis problems. With respect to net coverage error measurement, the Census Bureau needs to identify variables that are predictive of match rate or erroneous enumeration rate. With respect to component coverage error, the Census Bureau needs to identify variables that are predictive of the rate of census omissions, erroneous enumerations, duplications, and being enumerated in the wrong location (for various definitions of wrong location). Taking the example of match rate, there are two types of individuals in the PES, those who match and those who do not, along with a number of predictors that might be helpful in the discrimination. Identification of these predictors might utilize logistic regression, but there might be advantages to the use of other, more flexible techniques, such as classification trees, recursive partitioning, support vector machines, and modeling with flexible link functions. For instance, classification trees develop a tree structure of decision rules, indicating a prediction of matched or unmatched, that identify subsets of the joint range of values defined by the possible predictors for which the percentage of matches or nonmatches is substantially different than for the overall data. Such an approach does not rely on the linearity used in logistic regression modeling and is therefore more flexible. 
Even if such an approach were not used in a production capacity in 2010, new information about what types of people or addresses are missed in the census might be discovered through use of these techniques.
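To make the idea of a tree of decision rules concrete, the sketch below finds the single best split of a small data set by Gini impurity, which is the elementary step that a classification tree repeats recursively. The predictors ("renter", "age"), the match outcomes, and the function names are invented for illustration; a production tree would also recurse, prune, and account for survey weights.

```python
def gini(labels):
    """Gini impurity of a group of 0/1 labels (0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature, threshold, impurity) for the best single split.

    rows is a list of dicts mapping predictor name to a numeric value;
    labels holds 1 for a match and 0 for a nonmatch.
    """
    best = None
    n = len(rows)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[2]:
                best = (feature, threshold, impurity)
    return best


# Hypothetical P-sample persons: in this toy data, renters match less often.
rows = [
    {"renter": 0, "age": 45}, {"renter": 0, "age": 30},
    {"renter": 0, "age": 62}, {"renter": 0, "age": 23},
    {"renter": 1, "age": 25}, {"renter": 1, "age": 51},
    {"renter": 1, "age": 33}, {"renter": 1, "age": 40},
]
matched = [1, 1, 1, 1, 0, 0, 1, 0]

feature, threshold, impurity = best_split(rows, matched)
print(feature, threshold, impurity)
```

The split is chosen without assuming that the predictor acts linearly on the logit scale, which is the flexibility referred to above.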

What Are Legitimate Predictors in the Logistic Regression Model?

An issue concerning allowable logistic regression predictors is related to an issue that was raised when the poststratification design used for A.C.E. Revision II was modified from that used in the original A.C.E. Because of their predictive power, the Census Bureau decided to include new factors in the poststratification that were available only for the E-sample, resulting in different poststrata for the E-sample and the P-sample. The Census Bureau thought that these new factors would provide preferable poststrata for estimation of the probability of correct enumeration status. The new factors were (1) a variable indicating whether the E-sample enumeration was or was not a proxy enumeration and (2) a variable indicating the type of census return (early mail return, late mail return, early nonmail return, and late nonmail return). While the addition of these variables to the E-sample poststrata may have improved the partitioning of the E-sample into more homogeneous groups to reduce correlation bias and to improve synthetic (small-area) estimation, there was a concern that the difference in poststrata for the E-sample and the P-sample might cause a substantial number of failures in the balancing assumption. For example, a proxy enumeration often results in an interview with insufficient information. Insufficient information enumerations were treated in 2000 as if they were erroneous enumerations, and the P-sample enumerations that would have matched to those cases were treated as census omissions. Since the E-sample cases were proxy enumerations and therefore placed in a poststratum that did not exist for the corresponding P-sample cases for A.C.E. Revision II, these errors would be unlikely to balance.
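A small numerical sketch may help show why such errors fail to balance. It assumes a deliberately simplified dual-systems estimate for a cell (census count times correct enumeration rate, divided by match rate) and invented counts and rates; it is not the A.C.E. Revision II estimator, which involved additional adjustments.

```python
def dse(census_count, ce_rate, match_rate):
    """Simplified dual-systems estimate for one cell (an assumed form)."""
    return census_count * ce_rate / match_rate


# Hypothetical poststratum: 1,000 census records, every one a real resident.
# 100 are proxy enumerations, and 50 of those have insufficient information,
# so they are coded as erroneous and the 50 matching P-sample persons are
# coded as omissions.
match_rate = 0.95           # single P-sample poststratum (950 of 1,000 match)
ce_rate_all = 0.95          # E-sample poststratum that ignores proxy status
ce_rate_proxy = 0.50        # E-sample poststratum: proxy enumerations
ce_rate_nonproxy = 1.00     # E-sample poststratum: all other enumerations

# At the level of the whole poststratum the two codings cancel either way.
total_consistent = dse(1000, ce_rate_all, match_rate)
total_split = dse(100, ce_rate_proxy, match_rate) + dse(900, ce_rate_nonproxy, match_rate)

# Synthetic estimate for a hypothetical place with 100 census records,
# 40 of them proxy enumerations (a much higher proxy rate than average).
place_consistent = dse(100, ce_rate_all, match_rate)
place_split = dse(40, ce_rate_proxy, match_rate) + dse(60, ce_rate_nonproxy, match_rate)

print(round(total_consistent, 2), round(total_split, 2))
print(round(place_consistent, 2), round(place_split, 2))
```

In this toy setting the place with many proxy enumerations is estimated at about 84 persons against a census count of 100 when proxy status enters only the E-sample poststrata, an apparent net overcount, because the lowered correct enumeration rate is local to proxy cases while the offsetting lowered match rate is spread over the whole P-sample poststratum.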

It is clear that a similar issue arises with the use of logistic regression models of both the match rate and the correct enumeration rate, but it is substantially more difficult to assess. That is, P-sample information can be used to model the match rate, and census information can be used to model the correct enumeration rate. If the variables for these two logistic regression models are different, coverage rate estimates for some combinations of these variables might be biased, although it is not known whether this would cause bias for the domains (defined by geography, age, race/ethnicity, etc.) that are of interest for census estimation. It is therefore important to determine when the benefit of improved predictive power outweighs the loss from balancing problems. The problem may be reduced in the 2010 census given the likely reduction in the number of “erroneous” enumerations; given the collection of data on census residence, the removal of duplicates in real time, and other data improvement processes; and given the improved matching of KE cases.

In related research, Mulry et al. (2005) examined the following anomalous results in A.C.E. More than 5 percent of incorporated places⁸ in 2000 had an estimated net overcount of greater than 5 percent, and 0.5 percent had a net overcount of greater than 10 percent. This result runs counter to findings from the 1980 and 1990 coverage measurement programs of the potential degree of net overcoverage due to true erroneous enumerations and duplications. In contrast, only 0.1 percent of places had an estimated net undercount of greater than 5 percent, and nationally, the degrees of overcoverage and undercoverage were of essentially the same magnitude in the 2000 census. There is a concern that the lack of balance of designated erroneous enumerations and designated omissions mentioned above may be due to the use of proxy status and the type of census return as poststratification variables for the E-sample but not for P-sample computations.

⁸	See http://www.census.gov/dmd/www/ACEREVII_PLACES.txt for a list.

To examine this further, Mulry et al. (2005) demonstrated that by using proxy status in the E-sample poststratification, there were 91 places with a net overcount of greater than 10 percent, but if it is assumed that there was no error for proxy enumerations, this changes to only 16 places with net overcounts greater than 10 percent. Furthermore, if the assumption is made that there are no errors for proxy enumerations and also that there are no errors for late nonmail returns, the result is that there are only four places with a net overcount of greater than 5 percent. Given this and given that 27 percent of proxy enumerations had insufficient information for matching and follow-up, it is clear that proxy enumerations could be involved with substantial balancing error. The Census Bureau concluded that proxy enumerations contributed to these anomalous findings, but the judgment was that this was not the only cause.

Related research carried out by Spencer (2005) examined the quality of synthetic estimates for block clusters based on A.C.E. Revision II estimates, either using 938 E-sample poststrata and 648 P-sample poststrata, or using the same 648 poststrata for the E- and P-samples. His findings, in which the standard of comparison was either (a) the direct dual-systems estimate or (b) the census count plus people found in the P-sample who were omitted in the census for each block cluster, suggested that coarser but consistent poststrata may have provided more accurate estimates of net coverage error than finer poststratifications based on different E- and P-sample stratifications. However, for large blocks with proxy rates greater than 10 percent, the finer and inconsistent poststrata performed better.

A concern raised by Alho (1994) is whether a problem is caused by the use of census operational variables as predictors in these models. Since proxy enumeration and type of census return are both operational variables, it is possible that they should not be included as predictors in these models. Furthermore, as argued in Griffin (2000), it is conceivable that errors in responses, such as for household composition, could result in persons either being assigned to the wrong poststrata or being given incorrect covariates for use in logistic regression models, resulting in the failure of errors to balance. The panel hopes to address this general topic of which variables should and should not be included in the logistic regression models in its final report.

Modeling Geography Via Random Effects

In addition to the systematic effects of the variables described above, match rates and correct enumeration rates may also vary across the local census offices used to manage the workload in the census. The local office identifiers are on the A.C.E. research database, but they were not included in the six logistic regression models described above or the study by Schindler (2006). Census office indicator variables might be predictive of match and correct enumeration rates because factors that are particular to small areas could affect ease of enumeration. For example, local economic conditions and the expertise and capabilities of local census office administrators could vary. Because of the large number of local census offices (over 500) and the limited amount of data for each, these effects are more naturally represented as random effects. By including these random effects in the logistic regression models, the Census Bureau could estimate the effects of individual offices on match and correct enumeration rates and obtain valid estimates of the contribution of variability across offices to uncertainty about coverage rates in each area.

Malec and Maples (2005) explored this approach by incorporating random effects into a synthetic estimation model and then measuring the variance component of the random effects for local census offices. The ultimate objective of this approach is a small-area estimation methodology that would provide a compromise between synthetic estimation and a separate design-based estimator for each local office area.
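The notion of a compromise between a synthetic estimate and a direct estimate for each office can be illustrated with the textbook shrinkage form, in which the weight on the direct estimate grows as its sampling variance shrinks relative to the between-office variance. This is a generic small-area sketch with invented numbers, worked on the rate scale for simplicity; it is not the Bayesian model that Malec and Maples (2005) fitted.

```python
def composite_estimate(direct, direct_var, synthetic, between_office_var):
    """Shrink a direct office-level rate toward the synthetic (model) rate.

    Returns the composite estimate and the weight given to the direct estimate.
    """
    weight = between_office_var / (between_office_var + direct_var)
    return weight * direct + (1.0 - weight) * synthetic, weight


# Hypothetical inputs: a synthetic match rate of 0.92 and a between-office
# variance of 0.0004 (a standard deviation of 2 percentage points).
synthetic_rate = 0.92
between_office_var = 0.0004

# Office A has a precise direct estimate; office B has a noisy one.
est_a, weight_a = composite_estimate(0.85, 0.0001, synthetic_rate, between_office_var)
est_b, weight_b = composite_estimate(0.85, 0.0036, synthetic_rate, between_office_var)

print(round(est_a, 3), round(weight_a, 2))
print(round(est_b, 3), round(weight_b, 2))
```

Both offices report the same direct rate, but the office with the noisier estimate is pulled much closer to the synthetic rate, which is the behavior one would want from an estimator that borrows strength across more than 500 offices with limited data in each.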

Because of the A.C.E.’s complex design (weighted cases within samples of block clusters), many of the empirical correct enumeration rates and match rates used in Malec and Maples’s model are more variable than the nominal sample sizes would indicate. To account for the extra variability, Malec and Maples (2005) used a pseudo-likelihood approach with effective sample sizes estimated via the bootstrap.

In this approach, both logistic regression models (for match rate and correct enumeration rate) have the following generic form: $\operatorname{logit}(p_{ik}) = \beta_i + \mu_k + \alpha_{ik}$, where $p_{ik}$ is the rate being modeled for the i^th poststratum in the k^th local census office, $\beta_i$ is the fixed effect for i^th poststratum membership, $\mu_k$ is a random effect for the k^th local census office, and $\alpha_{ik}$ is model error. Furthermore, [two expressions for the distributions of the random terms are missing from this copy], where ce(i) is an index representing the collapsing of the poststrata into 11 or 8 cells, depending on whether the model is applied to the E-sample or the P-sample. Malec and Maples (2005) were able to estimate the large number of parameters in these models using Bayesian simulation.

This research suggests that inclusion of small-area effects could substantially improve coverage estimates. The work is still preliminary, and there remain outstanding questions concerning how best to treat the complex sample design, how many random effects can be included and at what level of aggregation, how best to estimate the model parameters, and how model fit should be assessed. The panel was impressed with this high-caliber research addressing an important issue in coverage modeling and strongly advocates further work in this area.

PANEL COMMENTS ON THE RESEARCH ON LOGISTIC REGRESSION MODELING

The immediate objective of the Census Bureau’s research on logistic regression is to determine whether it is preferable to poststratification for estimation of net coverage error in 2010 for small domains. Part of this assessment must include whether the model can be relied on in a production environment. The panel supports this research, anticipating that it is likely to identify an approach that will be preferable to poststratification. This research is consistent with arguments in National Research Council (2004b) supporting the use of model-based alternatives to poststratification.

However, the panel would like the Census Bureau to explore a wider range of options in determining what model form and predictors work best in a predictive environment. Also, the Census Bureau’s comparison of the logistic regression approach to poststratification when the logistic regression predictors are restricted to those used in the poststratification ignores a primary benefit of logistic regression: its ability to accommodate a larger number of predictors. This suggests that a more appropriate comparison is between the 2000 poststratification and logistic regression models with additional variables determined to provide additional predictive power.

Furthermore, there have been a variety of studies, outlined in Chapter 4, especially ethnographic work, that provide information as to why certain housing units are missed in the census and why people with various characteristics are missed in otherwise enumerated housing units. This information is moderately consistent with the variables currently included in the logistic regression models being examined by the Census Bureau, but the linkage between the research findings and the predictors in these models is not as direct as one would like. We think that the logistic regression models need to represent what is known about the sources of census coverage error, to the extent that this information is represented on the short form and in available contextual information. There also seems to be an unnecessary rush to pursue a model that can be used in a production environment, while there is still time to operate in a more exploratory manner. The panel therefore thinks that the Census Bureau has been too cautious in its examination of potential sets of predictors. The six models that have garnered the majority of attention to date are too similar to learn enough about what model forms and collections of predictors will work well. The Census Bureau should therefore expand the important research carried out by Schindler (2006) and apply it to the logistic regression models, attempting to identify unanticipated correlations between match rate or correct enumeration rate and the available predictors, using cross-validation to evaluate the resulting logistic regression models.

With respect to model form, the Census Bureau has also carried out some preliminary work on a very different use of two logistic regression models to model census net coverage error (see Griffin, 2005b). The first logistic regression models the probability that a housing unit will be missed in the Census Bureau’s Master Address File. The second logistic regression model, conditional on the first, estimates the probability that, given that the housing unit is included in the Master Address File, an individual with certain characteristics will be missed. A number of details remain unclear with this approach, including how to handle erroneous enumerations and duplications. However, the panel strongly endorses further work on this and other modeling ideas that, even if not used in a production environment, will add to the Census Bureau’s understanding of census coverage error.

Finally, the switch from use of poststrata to logistic regression modeling has important implications for census data users in communicating summaries of net coverage error. First, logistic regression modeling is likely to be more statistically efficient in its use of data than poststratification and, if so, may support estimates at lower levels of geographic and demographic aggregation. Therefore, the Census Bureau should examine what the reliability will be for estimates at various levels of aggregation and consider releasing estimates at a more detailed level than the A.C.E. poststrata, should the estimates support that. Second, for ease of comparison, while there are likely to be no poststrata in 2010 due to the use of logistic regression modeling, the Census Bureau should consider release of estimates of net coverage error for the 2010 census for comparable aggregates to support the comparison of net coverage error from census to census.

Recommendation 1: The Census Bureau should evaluate, for use in the 2010 census coverage measurement program, a broader range of models, most importantly logistic regression models, for net coverage error that includes variables in addition to those used to define the A.C.E. poststratification. These should include a wider range of predictors (e.g., geographic, contextual, family, and housing variables and census operational variables), alternative model forms (e.g., classification trees), and the use of random effects to model small-area variation.

The panel hopes to provide more guidance on the issue of which covariates to include in these logistic regression models in its final report. In the meantime, the Census Bureau should continue to investigate the full range of predictors while the panel and the Census Bureau continue to consider for which applications models with various predictors are appropriate.

RESEARCH ON CONTAMINATION DUE TO THE COVERAGE FOLLOW-UP INTERVIEW

Previous PESs initiated their data collection after the conclusion of the census data collection, with the minor exception of telephone A.C.E. interviewing in 2000. In 2000, this meant starting the A.C.E. nontelephone field interviewing after the conclusion of the nonresponse follow-up and the CEFU and CIFU interviewing. This was done for two reasons: (1) to avoid the possibility that the A.C.E. interview might impact the still incomplete census operations, thereby causing the PES blocks to be unrepresentative of the census, and (2) so that the evaluation that A.C.E. provided was of the complete census. However, the wait to begin the A.C.E. interviews increased the number of movers in the period between the census and the A.C.E., which reduced the quality of the data collected for A.C.E.

Any impact of the PES (CCM) interview (or other PES operations) on census processes in the PES blocks is a type of contamination. One way in which contamination might operate is if the census follow-up interview were affected by confusion with the already completed CCM interview. One possible impact is a refusal to participate in the census follow-up interview, but one can also posit other more subtle impacts on the census follow-up interview from CCM operations.

The impact of contamination on the entire census is essentially negligible, since the PES blocks represent a very small percentage of the country (less than 1 percent in 2000). However, given that contamination could result in a census in the CCM blocks that was not representative of the census in the remainder of the country, it might lead to substantial bias in the dual-systems estimates of net undercoverage.

In previous censuses, waiting for the various follow-up interviews and other coverage improvement programs to be completed prior to the collection of PES data was of less concern, since these were, generally speaking, relatively limited operations that could be concluded fairly expeditiously. However, in 2010, as noted above, the CFU interview could potentially involve a large fraction of the households in the United States and take a substantial time to complete. This would push back the CCM interviewing to September or later, resulting in a substantial increase in the number of movers and generally reducing the quality of the data collected in the CCM. Furthermore, there is a substantial similarity between the CFU and the CCM questionnaires, which might increase the possibility of contamination.

This issue has two aspects. The first is assessing the degree to which having the CCM interview precede the CFU interview affects the responses collected in the CFU interview. Attempts to measure contamination in the 1990 and 2000 censuses found no appreciable contamination (see, e.g., Bench, 2002), but, as argued above, the threat of contamination in the 2010 CFU seems more serious. If this contamination is ignorable, then the Census Bureau could let the interviews coexist in the field in 2010, in which case, one would also like to assess the impact of the CFU interview on the quality of the CCM interview.

There were two attempts to measure contamination in the 2006 census test. The first attempt compared interview rates and rates of erroneous enumerations and omissions in the two populations defined by the order of the interviews. This analysis was stratified by the various situations that result in a CFU interview, listed above. However, the measurement of contamination was indirect, and the modest sample size reduced the statistical power of the analysis. In addition, there was a matched pair design in which a second sample using the same sample design as the CCM was selected, using a block geographically proximate to the CCM sampled block. Then the population estimates for the two samples were compared. Again, the small sample size for this study was a concern.

Although it is too late for the 2006 test, the panel was interested in more direct observation of the impact of several proximate interviews used to determine the residents of a household. It is possible for a household to have a form mailed to it, with nonresponse resulting in several attempts to follow up by a field enumerator. If one of the various situations generating a CFU interview occurs, there will be attempts to carry out a CFU interview, and if then selected into the CCM sample, the household will be interviewed again, and finally, if there is a difficulty in matching to the E-sample, the household could be field interviewed a fourth time. To better understand the impact of several interviews occurring close in time and with similar content on respondents, the Census Bureau could carry out a limited test of this during 2007 or during the 2008 dress rehearsal. The panel is concerned that after the second interview, the chances either of a refusal or the collection of poor-quality data could increase.

The second aspect of the contamination issue is what to do in 2010 if appreciable contamination is either observed or cannot be ruled out. One might address this problem in several ways (see Kostanich and Whitford, 2005, for a discussion of some of these approaches).

1. Combine the CFU and the CCM interviews into one multipurpose interview. The panel has some sympathy for this position, given the similarity of the interviews. However, the CCM interview must be an independent count of a housing unit to satisfy the assumptions underlying DSE, whereas the CFU interview is dependent on information received in the initial census questionnaire. It is therefore difficult to combine these interviewing instruments.
2. Have the CFU interview occur either before or after the CCM interview, but apply the CCM coverage measurement program to the census before the application of the CFU interview. This is referred to as evaluating a truncated census, since the definition of the census for purposes of coverage evaluation is the census that existed prior to the taking of the CFU interview. Any enumerations added by carrying out CFU interviews after the CCM interviews were completed could be treated as “late additions” were treated in 2000, that is, removed from the census for purposes of coverage measurement. A problem with this approach is that if the CFU adds an appreciable number of people, or corrects the enumerations of an appreciable number of people, one is evaluating a truncated census that is substantially different from the actual census. Also, if these additions or corrections are considerably different in coverage error characteristics in comparison with the remainder of the population, that would add a bias to the dual-systems estimates. One could include the CFU interviews that occurred prior to the CCM interviews in the truncated census, in which case the net coverage error models could condition on whether a CFU interview was carried out prior to the CCM interview, which would remove any bias if the P-sample inclusion probabilities depended on the occurrence of the CFU interview (but not on its outcome; for details, see Bell, 2005). Information on what the CFU interview added from outside the CCM blocks also could be used in these models. There are some operational complexities to this idea, including the need to duplicate the formation of relatively large processing files. Finally, as mentioned previously, one is not evaluating the complete census, and therefore to assess components of census coverage error resulting from the application of the CFU, one would need to carry out a separate evaluation study outside the CCM blocks, which is a serious disadvantage.
3. Do not use the CFU in the CCM blocks. This avoids any contamination, but then the CCM evaluates an incomplete census, with essentially the same problems listed in (2).
4. Let the CFU and CCM interviews occur in whatever order they do, and treat contamination as a constant effect times an indicator variable for which of the two interviews comes first for households that have both CFU and CCM interviews. The difficulty with this approach is that it is not clear what the impact will be on whichever interview comes second, so it is not clear that contamination can be effectively modeled through use of a constant effect. For example, contamination might be a function of various characteristics of the household and therefore be subject to various interactions.
5. Delay the CCM interviews until the CFU interviews are complete. This does solve the contamination problem. However, coverage evaluation interviews that occurred in August 1980 were less useful than those in April due to the large number of movers that occurred during the four-month period. Therefore, this could have a substantial, negative impact on the quality of the CCM data that are collected in 2010, depending on how long one has to wait.
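
The concern about independence in the first option can be made concrete with the simplest form of dual-systems estimation. The sketch below is a textbook (Petersen-type) estimator, not the estimator actually used in A.C.E. or planned for the CCM, which involves poststratification and corrections for erroneous enumerations. The function name and all counts are hypothetical.

```python
def dual_system_estimate(census_count, p_sample_count, matched_count):
    """Basic dual-systems (capture-recapture) estimate of the true population.

    Assumes the two lists are independent: being counted in the census does
    not change the chance of being counted in the P-sample.
    """
    if matched_count <= 0:
        raise ValueError("at least one matched person is required")
    return census_count * p_sample_count / matched_count


# Hypothetical block: 900 correct census enumerations, 100 P-sample people,
# 90 of whom match to the census.
independent = dual_system_estimate(900, 100, 90)
print(independent)  # 1000.0

# If an earlier interview makes census-counted people easier to find in the
# P-sample, matches rise and the estimate is pulled below 1000.
contaminated = dual_system_estimate(900, 100, 95)
print(contaminated < independent)  # True
```

In this toy case, a modest increase in the match count caused by dependence between the two interviews lowers the population estimate, which is why contamination of the CCM interview by the CFU interview matters for net coverage error.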

The panel has not yet come to a consensus on this question. The panel was interested in further examination of the implications of a truncated census (option 2) or of combining the two instruments (option 1). The Census Bureau believes that the best approach is to delay the CCM interviews until after all CFU interviews are completed (option 5). The basis for this decision was that, in this way, the Census Bureau would not be planning for a substandard census in any area (as would certainly be the case under option 3), and that combining the interviews (option 1) might harm both of them. Furthermore, the effect of option 4 is unknown and difficult to test prior to the 2010 census. (For more details on the Census Bureau’s views on contamination, see Kostanich and Whitford, 2005.) However, the panel did not find the argument about the difficulty of duplicating census processing files for option 2 compelling, given the current availability of inexpensive computer memory.

The panel does have concerns about not starting the CCM interviews until September 2010, given the increased number of movers that this would create between Census Day and the CCM interview. It is hoped that by expediting certain operations, an August start for the CCM might still be possible. For this reason, it is important to collect good data in 2006 and 2008 on the impact of delays of various lengths on the number of movers. In this and several other respects, the results from the 2006 census test will inform the Census Bureau’s position on this issue.

SAMPLE DESIGN FOR THE CCM POSTENUMERATION SURVEY

An important question concerning the CCM program is what modifications should be made to the design of A.C.E. in looking toward the CCM in 2010, given the change in objectives in coverage measurement between the 2000 and the 2010 censuses. That is, to what extent can the new goal of process improvement be incorporated into the design of the CCM PES?

The proposed design for the CCM PES in 2010 is as follows (for details, see Fenstermaker, 2005). The Census Bureau is assuming that the CCM PES will draw a sample of 300,000 housing units, with primary sampling units comprising block clusters. The panel supports Recommendation 6.1 of National Research Council (2004b), which endorses a PES that would produce net coverage estimates of the same precision as that of the 2000 A.C.E. The block clusters are meant to contain around 30 housing units, and the plan is to subsample them in the event that they contain substantially more. These block clusters will be stratified into three categories: (1) medium and large clusters, with some subsampling within large block clusters, (2) American Indian Reservation block clusters, and (3) small block clusters, which will use a two-phase design that samples block clusters under a certain size but retains all small block clusters above that size. In allocating the sample of 300,000 housing units to states, the general approach will be to sample proportional to the total population of each state. However, each state’s sample will contain a minimum of 60 block clusters, and Hawaii will be allocated 150 block clusters. In addition, there will be a separate American Indian Reservation sample drawn proportionally to the 2000 census count of American Indian and Alaskan Native populations living on American Indian reservations.
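
The state allocation rule described above (proportional to population, with a floor of 60 block clusters and a fixed allocation for Hawaii) can be sketched as follows. This is only an illustration of the rule as stated here, not the Census Bureau's allocation procedure: the function name, the state populations, and the total of 1,000 clusters are hypothetical, and rounding to whole clusters is left out.

```python
def allocate_clusters(populations, total_clusters, minimum=60, fixed=None):
    """Allocate block clusters to states proportionally to population,
    subject to a per-state minimum and optional fixed allocations."""
    fixed = dict(fixed or {})
    allocation = {state: float(n) for state, n in fixed.items()}
    # States without a fixed allocation share whatever is left over.
    free = {s: p for s, p in populations.items() if s not in fixed}
    remaining = float(total_clusters) - sum(allocation.values())
    while free:
        if remaining < minimum * len(free):
            raise ValueError("total is too small to give every state the minimum")
        total_pop = sum(free.values())
        shares = {s: remaining * p / total_pop for s, p in free.items()}
        below = [s for s, share in shares.items() if share < minimum]
        if not below:
            allocation.update(shares)
            break
        # Raise small states to the floor, then re-share the rest.
        for s in below:
            allocation[s] = float(minimum)
            remaining -= minimum
            del free[s]
    return allocation


# Hypothetical populations; only Hawaii's fixed allocation comes from the text.
example = allocate_clusters(
    {"A": 30_000_000, "B": 8_000_000, "C": 600_000, "Hawaii": 1_200_000},
    total_clusters=1_000,
    minimum=60,
    fixed={"Hawaii": 150},
)
for state, clusters in sorted(example.items()):
    print(state, round(clusters, 1))
```

Because the floor and the fixed allocation take clusters away from the proportional pool, the largest states receive somewhat less than a strictly proportional share, which is the trade-off the panel discusses when it questions the minimum of 60 block clusters per state.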

The rationale behind the state allocations for the 2010 CCM PES is that this is intended to be a general purpose sample, so any oversampling in comparison to proportional allocation needs to be strongly justified. In addition, the Census Bureau was very satisfied with the 2000 A.C.E. design, which this design roughly duplicates. The Census Bureau has no specific variance requirements for the 2010 CCM estimates, since production of adjusted counts is not anticipated.

The Census Bureau did examine some alternative specifications for the design of the CCM PES, using simulation studies of the quality of the resulting net coverage error estimates and assessment of components of census coverage error, especially estimation of the number of omissions and erroneous enumerations at the national level and for 64 poststrata (see Fenstermaker, 2005). The designs were (1) the design described above, with allocations proportional to total state population, but with a minimum of 60 block clusters per state, and with Hawaii allotted 150 block clusters; (2) similar to (1) except Hawaii is allocated only 60 block clusters; (3) a design in which allocations are made to the four census regions to minimize the variance of estimates of erroneous enumerations, but within regions, allocations are made proportional to state size; and (4) a design in which half of the sample is allocated proportional to the number of housing units within update/leave areas, and half of the sample is allocated proportional to each state’s number of housing units.

Through use of simulations, for each design and PES sample, national estimates were computed of the rate of erroneous enumerations (and the rate of erroneous enumerations with mail returns, with nonresponse follow-up, and with CFU), the nonmatch rate, the omission rate, and the net error rate. Finally, national estimates of the population were computed, along with their standard errors. The same analysis was done at the poststrata level. One hundred replications were used for the simulation study. The results supported retention of the design that closely approximated the 2000 A.C.E. design, described above.
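
The general shape of such a replication study can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the study reported in Fenstermaker (2005): it assumes a synthetic population of equal-sized block clusters, simple random sampling of clusters, and a single rate (erroneous enumerations). The function name and all parameter values are hypothetical; only the use of 100 replications comes from the text.

```python
import random
import statistics


def simulate_rate_estimates(cluster_rates, sample_size, replications=100, seed=0):
    """Draw repeated simple random samples of block clusters and return the
    mean of the estimated rates and their standard deviation across
    replications (an empirical standard error for the design)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = rng.sample(cluster_rates, sample_size)
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Synthetic erroneous-enumeration rates for 5,000 hypothetical block clusters.
population_rng = random.Random(1)
rates = [population_rng.betavariate(2, 50) for _ in range(5_000)]

# Compare two hypothetical designs that differ only in the number of clusters.
for n_clusters in (50, 500):
    mean_estimate, standard_error = simulate_rate_estimates(rates, n_clusters)
    print(n_clusters, round(mean_estimate, 4), round(standard_error, 4))
```

Comparing alternative designs then amounts to swapping in a different sampling rule inside the loop and comparing the resulting standard errors for each estimate of interest.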

The panel has not yet come to a consensus on whether to recommend modifications to this design for the CCM PES in 2010. There is some concern that the allocation of a minimum of 60 block clusters to each state is too linked with the need to provide adjusted counts for states and not as targeted toward measurement of the rates of the four types of census component coverage errors. If it is the case that the households that are more problematic to count can be linked to relatively focused geographic regions, it would be interesting to evaluate a design that oversampled those areas to see the impact on the reliability of measurement of census component error rates. This is similar to design alternative (3) above, but what we are suggesting is more targeted than that.

Furthermore, we think that the Census Bureau needs to give more consideration to its within-state allocations of block groups. For example, the possibility of oversampling block groups in predominantly minority areas with, say, large percentages of renters is an alternative that deserves further consideration. The panel is also not clear why the Census Bureau is not making greater use of its planning database, which provides an indication of the difficulty of enumerating block groups.

Settling on a sample design for the CCM in 2010 is a difficult task. There are two general objectives of the coverage measurement program for 2010. First, there is the primary objective put forward by the Census Bureau, which is the measurement of census component coverage errors at some unspecified level of geographic and demographic aggregation. Second, there remains the need to measure net coverage error at the level of the poststrata used in 2000 in order to facilitate comparison with the 2000 census. To address the first goal, one would like to target problematic domains. However, one has to guard against unanticipated problems that might appear in previously easy-to-count areas. To do that and to provide estimates of net coverage error across the United States, a less targeted design is needed. These various demands individually argue for very different designs, and mutually accommodating them, to the extent possible, is challenging. The panel anticipates providing much more direction on this question in its final report.

Research on the Use of Administrative Records in Support of Coverage Improvement and Coverage Measurement in 2010

The Census Bureau’s research program has explored decennial uses of administrative records, that is, data collected as a by-product of administering a governmental program, since the 1980s. Possible uses include (1) a purely administrative records census; (2) improving census nonresponse follow-up either by using enumerator follow-up only when administrative records do not contain the required information or, alternatively, using administrative records to complete information for households that do not respond after several attempts by field enumerators; (3) improving the Master Address File using addresses in administrative records; (4) assisting in coverage measurement, for example, through use of triple-systems estimation (a generalization of DSE in which the third system is a merged list of individuals from administrative records); and (5) assisting in coverage improvement, for example, by identifying census blocks for which the census count is likely to be of poor quality. We emphasize that the use of administrative records could be the most promising idea for assisting in the measurement of omissions of hard-to-enumerate groups.

Until recently, however, neither these nor other applications of administrative records had been implemented during a decennial census. The reasons include the limited quality of the available administrative records (including the currency of the information, especially for addresses), the computational burden, and concerns about public perceptions. (An approach to the problem of currency of address can be found in Stuart and Zaslavsky, 2002.) As a result, until 2000, there was no comprehensive census test of the use of administrative records for any purpose, although there were earlier assessments of the coverage of merged administrative lists.⁹

Now, however, several of these concerns have been ameliorated. The quality and availability of national administrative records are improving, computing power has increased dramatically, and as a result the very active research group on administrative records at the Census Bureau has achieved some impressive results. The primary program and database, referred to as E-StARS, now has an extract of a validated, merged, unduplicated residential address list with 150 million entries, 80 percent of which are geocoded to census blocks, and another extract of a validated, merged, unduplicated list of residents with demographic characteristics. These lists are approaching the completeness of coverage that might be achieved by a decennial census. Seven national files are merged to create E-StARS, with the Social Security Number Transaction File providing demographic data.

The panel strongly supports this research program, and we think that there is a real possibility that administrative records could and should be used in the 2010 census for coverage improvement, for nonresponse follow-up, or for coverage measurement. Potentially feasible uses in the 2010 census include the following:

To improve or evaluate the quality of either the Master Address File or the address list of the postenumeration blocks. The quality of the Master Address File is a key to a successful mailout of the census questionnaires and nonresponse follow-up, and the quality of the independent list that is created in the PES blocks is a key to a successful coverage measurement program. E-StARS provides a list of addresses that could be used in at least two ways. First, the total number of E-StARS addresses for small areas could be checked against the corresponding Master Address File totals or PES totals to identify areas with large discrepancies that could be relisted. Second, more directly, address lists could be matched to identify specific addresses that are missed in either the Master Address File or the PES address listing, with discrepancies followed up in the field for resolution. Note that while administrative records could be used to improve the address list for either the census or the PES, to maintain independence they should not be used for both.

⁹ The Census Bureau operates under the constraint that information obtained from administrative records under confidentiality restrictions cannot be sent out to the field to assist enumerators, which prohibits the use of some applications of administrative records.

To assist in late-stage nonresponse follow-up. The Census Bureau makes several attempts to collect information from mail nonrespondents to the census form. When these attempts fail to collect information, attempts are made to locate a proxy respondent and, when that fails, hot-deck imputation is used to fill in whatever information is needed, including the residence’s vacancy status and the household’s number of residents. If the quality of E-StARS information is found to be at least as good as that from hot-deck imputation or even proxy interviews, it might be effective to attempt to match nonrespondents to E-StARS before either pursuing a proxy interview or using hot-deck imputation. Especially with a short-form-only census, E-StARS might be sufficiently complete and accurate for this purpose. (It may ultimately be discovered, possibly during an experiment in 2010, that fewer attempts at collecting nonresponse data are needed by making use of E-StARS information after, for example, only one or two attempts at nonresponse follow-up, thereby shortening and reducing the costs of nonresponse follow-up.)
For item imputation. The Census Bureau often uses item imputation to fill in modest amounts of item nonresponse. Item nonresponse could affect the ability to match a P-sample individual to the E-sample, and missing demographic and other information may result in an individual being placed in the wrong poststratum. Item imputation based on information from E-StARS may be preferable to hot-deck imputation. The use of E-StARS to provide item imputation is currently being tested as part of the 2006 census test.
To improve targeting of the coverage improvement follow-up interviews. The coverage improvement interview in 2010, as currently planned, will follow up households with any of the following six conditions: (1) uncertain vacancy status, (2) characteristics for additional people in large households, (3) resolution of count discrepancies, (4) duplicate resolution, (5) persons who may have been enumerated at residences other than the one in question, and (6) nonresidents who sometimes stayed at the housing unit in question. The workload for this operation might well exceed the Census Bureau’s capacity to carry out the necessary fieldwork, given limited time and resources. Administrative records could possibly be used to help identify situations in which field resolution is not needed, for example, by indicating which of a set of duplicates is at the proper residence. (Uses of E-StARS like this are being attempted in the 2006 census test.)
To help determine the status of a nonmatch prior to follow-up of nonmatches in the PES. It is very possible that nonmatches of the P-sample to the census may be resolved through the use of administrative records, for example, by indicating that there was a geocoding error or a misspelled name, thereby saving the expense and time of additional CCM fieldwork.
To evaluate the census coverage measurement program. Many of the steps leading to production of dual-systems estimates might be checked using administrative records. For example, administrative records information might be used to assess the quality of the address list in the P-sample blocks, to assess the quality of the matching operation, or to assess the quality of the small-area estimation of population counts. (However, any operation that makes use of administrative records cannot also use the same administrative records for purposes of evaluation.)
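
The first use listed above, checking small-area address totals against an administrative list to find areas that might be relisted, lends itself to a simple illustration. The sketch below is not a description of E-StARS processing: the function name, the block identifiers, the counts, and the 10 percent threshold are all hypothetical.

```python
def flag_discrepant_areas(maf_counts, admin_counts, relative_threshold=0.10):
    """Return small areas whose address counts on the two lists differ by more
    than the given fraction of the larger count."""
    flagged = []
    for area in sorted(set(maf_counts) | set(admin_counts)):
        maf = maf_counts.get(area, 0)
        admin = admin_counts.get(area, 0)
        larger = max(maf, admin)
        if larger == 0:
            continue  # no addresses on either list, so nothing to compare
        gap = abs(maf - admin) / larger
        if gap > relative_threshold:
            flagged.append((area, maf, admin, round(gap, 3)))
    return flagged


# Hypothetical address counts by census block.
maf = {"block-001": 100, "block-002": 80, "block-003": 40}
admin = {"block-001": 102, "block-002": 60, "block-004": 12}

for area, maf_n, admin_n, gap in flag_discrepant_areas(maf, admin):
    print(area, maf_n, admin_n, gap)
```

A count comparison of this kind only points to areas worth a closer look. Identifying the specific addresses missing from either list requires address-level matching, which is considerably harder than comparing totals.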

The administrative records group at the Census Bureau has already had a number of successful applications of E-StARS. First, an administrative records census was conducted in five counties during the 2000 census, and its quality was judged to be comparable to that of the census in those counties. Second, E-StARS was used to explain 85 percent of the discrepancies between the Maryland Food Stamp Registry recipients and estimates from the Census Supplementary Survey in 2001 (the pilot American Community Survey).

The panel considers this important and promising research that should play a key role in censuses beginning in the year 2020, given the potential for cost savings and quality improvement. With respect to use in 2010, since the various suggestions depend crucially on the quality of the merged and unduplicated lists of addresses and people in E-StARS, the use of E-StARS for any of the above purposes in 2010 will require further examination of the quality of the lists, as well as evaluation of the specific application in comparison to the current method used in the census. Until there are rigorous operational tests of both feasibility and effectiveness, it would not be reasonable to move toward implementation in 2010. Given where we are in the decade, it is unlikely that more than one of the above six bulleted applications could have sufficient resources devoted to support incorporation in the 2008 dress rehearsal, which is a necessity for implementation in 2010. Therefore there is a need to focus immediately on one very specific proposal.

The panel recommends that one of the above applications be developed sufficiently to support a rigorous test in the 2008 dress rehearsal with the goal of implementation in 2010 should the subsequent evaluation support its use. Furthermore, the Census Bureau should begin now to design rigorous tests of all the above suggestions for the use of administrative records, very possibly during the 2010 census, as a first step toward decennial census application of administrative records in 2020. We think that administrative records have great promise for assisting in understanding census omissions and therefore need to be used either for evaluation of the CCM or as a part of the CCM program.

Recommendation 2: The Census Bureau should choose one or more of the proposed uses of administrative records (e.g., tax record data or state unemployment compensation data) for coverage improvement, nonresponse follow-up, or coverage measurement and comprehensively test those applications during the 2008 census dress rehearsal. If a process using administrative records improves on processes used in 2000, that process should be implemented in the 2010 census.

We add that evaluations of the use of administrative records are often viewed as involving extensive, resource-intensive fieldwork. However, while fieldwork needs to be involved to some extent, much evaluation of administrative records can be accomplished if the Census Bureau structures its various databases collected from test censuses in a way that facilitates matching.

Furthermore, if data from E-StARS are used successfully in 2010, the Census Bureau should consider more ambitious uses of administrative data in the 2020 census. Specifically, the Census Bureau might use administrative data to replace the nonresponse follow-up interview for many housing units, not just late-stage nonresponse. Under this proposal, the Census Bureau would use data from administrative records to determine the occupancy status of some nonresponding housing units and the number and characteristics of their residents. To do so, the Census Bureau would have to develop criteria of adequacy of the information in the administrative records to establish the existence and membership of the household for this purpose. For example, agreement of several records of acceptable currency and quality might be considered sufficient to use the information as a substitute for a census enumeration, which would reduce the burden of field follow-up.
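
To make the idea of an adequacy criterion concrete, the sketch below accepts an administrative-records roster for a housing unit only when enough sufficiently recent records agree on who lives there. It is purely illustrative: the criteria themselves would have to be developed and tested by the Census Bureau, and the function name, record layout, one-year currency window, and two-record agreement rule are all assumptions made for this example.

```python
from collections import Counter
from datetime import date


def admin_enumeration(records, reference_date, max_age_days=365, min_agreeing=2):
    """Return the agreed household roster (sorted) if at least `min_agreeing`
    current records list the same residents; otherwise return None, meaning
    the housing unit still needs field follow-up."""
    current = [
        r for r in records
        if 0 <= (reference_date - r["as_of"]).days <= max_age_days
    ]
    rosters = Counter(frozenset(r["residents"]) for r in current)
    if not rosters:
        return None
    roster, votes = rosters.most_common(1)[0]
    return sorted(roster) if votes >= min_agreeing else None


# Hypothetical records for one nonresponding housing unit.
census_day = date(2010, 4, 1)
records = [
    {"source": "file-1", "as_of": date(2010, 2, 15), "residents": ["person-1", "person-2"]},
    {"source": "file-2", "as_of": date(2009, 11, 1), "residents": ["person-2", "person-1"]},
    {"source": "file-3", "as_of": date(2007, 6, 30), "residents": ["person-1"]},
]

print(admin_enumeration(records, census_day))      # two current records agree
print(admin_enumeration(records[:1], census_day))  # one record is not enough
```

Stricter or looser settings for the currency window and the number of agreeing records trade off the amount of field follow-up avoided against the risk of accepting an outdated or incorrect roster.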

This would represent a substantial change in what constitutes a census enumeration, of at least the same conceptual magnitude as the change from in-person to mail enumerations as the primary census methodology. However, the completeness of administrative systems and the capabilities for matching and processing administrative records have been growing, while cooperation with field operations has declined. These contrasting trends make it increasingly likely that administrative records can soon provide enumerations of quality at least as good as field follow-up for some housing units. Furthermore, in contrast to purely statistical adjustment methods, every census enumeration would correspond to a specific person for whom there is direct evidence of residence and characteristics. The long-run potential for such broader contributions from administrative records is a reason to give high priority to their application in the 2010 census, in addition to their direct benefits in that census.

Two possible objections might be raised in opposition to this approach. First, this use of administrative records may be ruled to be inconsistent with interpretations of what a census entails in the Constitution. Second, public acknowledgment that this method is being used might have a negative impact on the level of cooperation with census-taking. These two issues would need to be resolved before the Census Bureau could go forward. Also, this is clearly dependent on the success of the more modest efforts suggested for possible use in 2010.