The 2000 Census: Counting Under Adversity

CHAPTER 6
The 2000 Coverage Evaluation Program

We turn now to the final Revision II estimates of the population from the 2000 Accuracy and Coverage Evaluation (A.C.E.) Program. By early 2003 the Census Bureau had completed extensive reanalyses of the evaluation data sets used for the October 2001 preliminary revised A.C.E. estimates of net undercount. It released a new set of detailed estimates based on the original A.C.E. and evaluation data (the A.C.E. Revision II estimates) at a joint public meeting of our panel and the Panel on Research on Future Census Methods on March 12, 2003 (see http://www.census.gov/dmd/www/Ace2.html [12/22/03]). These estimates showed a small net overcount of 0.5 percent of the total population instead of a net undercount of 1.2 percent as originally estimated from the A.C.E. in March 2001. The latest demographic analysis estimates still showed a net undercount of the population, but it was negligible (0.1 percent) (Robinson, 2001b:Table 2).

At the joint panel meeting Census Bureau officials announced the Bureau’s recommendation, accepted by the secretary of commerce, to produce postcensal population estimates on the basis of the official 2000 census results. The A.C.E. Revision II estimates would not be used as the basis for developing estimates throughout the decade
of the 2000s. The decision document (U.S. Census Bureau, 2003b:1) stated:

“The Accuracy and Coverage Evaluation (A.C.E.) Revision II methodology represents a dramatic improvement from the previous March 2001 A.C.E. results. However, several technical errors remain, including uncertainty about the adjustment for correlation bias, errors from synthetic estimation, and inconsistencies between demographic analysis estimates and the A.C.E. Revision II estimates of the coverage of children. Given these technical concerns, the Census Bureau has concluded that the A.C.E. Revision II estimates should not be used to change the base for intercensal population estimates.”

With the final decision not to adjust population estimates for measured net undercount in the 2000 census behind it, the Census Bureau announced its intention to focus exclusively on planning for the 2010 census. Plans for that census include work on possibly using computerized matching of the type conducted for the October 2001 and March 2003 adjustment decisions to eliminate duplicate enumerations as part of the census process itself. Bureau officials also expressed the view that coverage evaluation could not be completed and evaluated in a sufficiently timely fashion to permit adjustment of the data used for legislative redistricting (Kincannon, 2003).

In this chapter we assess the A.C.E. Revision II estimation methodology and the resulting estimates of population coverage in the 2000 census. We first review key aspects of the original A.C.E. Program (6-A) and then review the data sources, methods, and results of the A.C.E. Revision II effort (6-B). We discuss two kinds of enumerations that had more impact on coverage in 2000 than in 1990: (1) whole-person (including whole-household) imputations and (2) duplicate enumerations in the census and the A.C.E. (6-C). Section 6-D provides an overall summary of what we know and do not know about population coverage in 2000.
Section 6-E provides our recommendations for coverage evaluation research and development for 2010.

6–A ORIGINAL A.C.E. DESIGN AND OPERATIONS

An important part of evaluating the Revision II A.C.E. population estimates for 2000 is to consider how well the original A.C.E.
Program was designed and executed to produce the underlying input data. We consider below 10 aspects of the original A.C.E. followed by a summary of findings in Section 6-A.11: basic design; conduct and timing; definition of the P-sample (treatment of movers); definition of the E-sample (exclusion of “insufficient information” cases); household noninterviews in the P-sample; imputation for missing data in the P-sample and E-sample; accuracy of household residence information in the P-sample and E-sample; quality of matching; targeted extended search; and poststratification.

6–A.1 Basic Design

The design of the 2000 A.C.E. was similar to that of the 1990 Post-Enumeration Survey (PES). The goal of each program was to provide a basis for estimating two key components of the dual-systems estimation (DSE) formula for measuring net undercount or overcount in the census (see Section 5-A). They are: (1) the match rate, or the rate at which members of independently surveyed households in a sample of block clusters (the P-sample) matched to census enumerations, calculated separately for population groups (poststrata) and weighted to population totals; and (2) the correct enumeration rate, or the rate at which census enumerations in the sampled block clusters (the E-sample) were correctly included in the census (including both matched cases and nonmatched correct enumerations), calculated separately for poststrata and weighted to population totals.[1]

[1] The E-sample by design excluded some census cases in the A.C.E. block clusters (see Appendix E.1.e and E.3).
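As a rough illustration of how these two rates enter the estimate, the sketch below assumes the standard dual-systems form, DSE = (C − II) × (correct enumeration rate) ÷ (match rate), where C is the census count and II is the count of insufficient-information records. The chapter's own formula (Equation 5.1 in Section 5-A) is not reproduced in this excerpt, so the function names and all input numbers are hypothetical and serve only to show the direction of each effect.

```python
def dual_systems_estimate(census_count, insufficient_info,
                          correct_enum_rate, match_rate):
    """Assumed standard DSE form for one poststratum (not taken from this excerpt).

    census_count      -- census count C, including II records
    insufficient_info -- II records excluded from matching
    correct_enum_rate -- E-sample correct enumeration rate, in [0, 1]
    match_rate        -- P-sample match rate, in (0, 1]
    """
    if not 0.0 < match_rate <= 1.0:
        raise ValueError("match_rate must be in (0, 1]")
    if not 0.0 <= correct_enum_rate <= 1.0:
        raise ValueError("correct_enum_rate must be in [0, 1]")
    return (census_count - insufficient_info) * correct_enum_rate / match_rate


def net_undercount_rate(dse, census_count):
    """Net undercount as a share of the DSE; negative values mean a net overcount."""
    return (dse - census_count) / dse


# Hypothetical poststratum: 1,000 census records, 30 of them IIs.
base = dual_systems_estimate(1000.0, 30.0, 0.95, 0.92)
higher_match = dual_systems_estimate(1000.0, 30.0, 0.95, 0.94)
higher_correct = dual_systems_estimate(1000.0, 30.0, 0.97, 0.92)

for label, dse in [("base", base),
                   ("higher match rate", higher_match),
                   ("higher correct enumeration rate", higher_correct)]:
    print(f"{label}: DSE={dse:.1f}, "
          f"net undercount rate={net_undercount_rate(dse, 1000.0):.4f}")
```

Under this assumed form, raising the match rate lowers the estimate and raising the correct enumeration rate raises it, holding the other inputs fixed.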
Other things equal, the higher the match rate, the lower will be the DSE population estimate and the estimated net undercount in the census. Conversely, the more nonmatches, the higher will be the DSE population estimate and the estimated net undercount. In contrast, the higher the correct enumeration rate, the higher will be the DSE population estimate and the estimated net undercount. Conversely, the more erroneous enumerations, the lower will be the DSE population estimate and the estimated net undercount (for how this result obtains, refer to Equation 5.1 in Section 5-A).

The A.C.E. and PES design and estimation focused on estimating the net undercount and not on estimating the numbers or types of gross errors of erroneous enumerations (overcounts) or of gross omissions. There are not widely accepted definitions of components of gross error, even though such errors are critically important to analyze in order to identify ways to improve census operations. Some types of gross errors depend on the level of geographic aggregation. For example, assigning a census household to the wrong small geographic area (geocoding error) is an erroneous enumeration for that area (and an omission for the correct area), but it is not an erroneous enumeration (or an omission) for larger areas. Also, the original A.C.E. design, similar to the PES, did not permit identifying duplicate census enumerations as such outside a ring or two of blocks surrounding a sampled block cluster. On balance, about one-half of duplicate enumerations involving different geographic areas should be classified as an “other residence” type of erroneous enumeration at one of the two addresses because the person should have been counted only once, but this balancing may not be achieved in practice.

Several aspects of the original A.C.E.
design were modified from the PES design in order to improve the timeliness and reduce the variance and bias of the results (see Section 5-D.1). Some of these changes were clearly improvements. In particular, the larger sample size (300,000 households in the A.C.E. compared with 165,000 households in the PES) and the reduction in variation of sampling rates considerably reduced the variance of the original A.C.E. estimates compared with the PES estimates (see Starsinic et al., 2001). The coefficient of variation for the originally estimated coverage correction factor of 1.012 for the total population in 2000 was 0.14 percent, a reduction of 30 percent from the comparable coefficient of variation
in 1990.[2] (The 0.14 percent coefficient of variation translates into a standard error of about 0.4 million people in the original DSE total household population estimate of 277.5 million.) The coefficients of variation for the originally estimated coverage correction factors for Hispanics and non-Hispanic blacks were 0.38 and 0.40 percent, respectively, in 2000, reduced from 0.82 and 0.55 percent, respectively, in 1990 (Davis, 2001:Tables E-1, F-1). However, some poststrata had coefficients of variation as high as 6 percent in 2000, which translates into a large confidence interval around the estimate of the net undercount for these poststrata and for any geographic areas in which they are a large proportion of the population.

Three other improvements in the A.C.E. design deserve mention. First, an initial housing unit match of the independent P-sample address listing with the Master Address File (MAF) facilitated subsequent subsampling, interviewing, and matching operations. Second, the use of computer-assisted interviewing (by telephone in the first wave) facilitated timeliness of the P-sample data, which had the positive effect of reducing the percentage of movers compared with the 1990 PES (see Section 6-A.3). Third, improved matching technology and centralization of matching operations probably contributed to a higher quality of matching than achieved in 1990 (see Section 6-A.8).

Another innovation, the decision to target the search for matches and correct enumerations in surrounding blocks more narrowly in the A.C.E. than in the PES, was originally suspected of having contributed balancing errors to the original (March 2001) DSE estimates, but subsequent evaluation allayed that concern (see Section 6-A.9). The treatment of movers was more complex in the A.C.E. than in the 1990 PES, but, primarily because there were proportionately fewer movers in the A.C.E.
compared with the PES, on balance, movers had no more effect on the dual-systems estimates for 2000 than on those for 1990 (see Section 6-A.3).

[2] The coefficient of variation (CV) is the standard error of an estimate as a percentage of the estimate. The coverage correction factor, which would be used in implementing an adjustment, is the dual-systems estimate divided by the census count (including whole-person imputations and late additions); see Section 5-A.

In retrospect, the decision to exclude the group quarters population from the A.C.E. universe (see Section 5-D.1) was unfortunate, as it precluded the development of coverage estimates for group
quarters residents, who appear to have been poorly enumerated (see Section 4-F). Finally, a flaw in the planning for not only the A.C.E., but also the census itself, was the failure to anticipate the extent to which certain groups of people (e.g., college students, prisoners, people with two homes) would be duplicated in more than one census record (see Section 6-A.7).

6–A.2 Conduct and Timing

Overall, the original A.C.E. was well executed in terms of timely and well-controlled address listing, P-sample interviewing, matching, follow-up, and original estimation. Although the sample size was twice as large as that fielded in 1990, the A.C.E. was carried out on schedule and with only minor problems that necessitated rearrangement or modification of operations after they had been specified. Mostly, such modifications involved accommodation to changes in the MAF that occurred in the course of the census. For example, the targeted extended search (TES) procedures had to be modified to handle deletions from and additions to the MAF that were made after the determination of the TES housing unit inventory (Navarro and Olson, 2001:11). Some procedures proved more useful than had been expected. In particular, the use of the telephone (see Appendix E.2) enabled P-sample interviewing to begin April 24, 2000, whereas P-sample interviewing for the PES did not begin until June 25, 1990. All A.C.E. processes, from sampling through estimation, were carried out according to well-documented specifications, with quality control procedures (e.g., reviews of the work of clerical matchers and field staff) implemented at appropriate junctures.

6–A.3 Defining the P-Sample: Treatment of Movers

The A.C.E. P-sample, partly because of design decisions made for the previously planned Integrated Coverage Measurement Program (see Section 5-D.1), included three groups of people and not two as in the 1990 PES.
The three groups were: nonmovers who lived in a P-sample housing unit on Census Day and on the A.C.E. interview day; outmovers who lived in a P-sample housing unit on Census Day but had left by the A.C.E. interview day; and inmovers
who moved into a P-sample housing unit between Census Day and the A.C.E. interview day.

In the dual-systems estimation for each poststratum (population group), the number of matched movers was calculated by applying the estimated match rate for outmovers to the weighted number of inmovers.[3] This procedure, called PES-C, assumed that inmovers would be more completely reported than outmovers. The Bureau also anticipated that it would be easier to ascertain the Census Day residence status of outmovers than to search nationwide for the Census Day residence of inmovers, as was done in the 1990 PES using the PES-B procedure.

An analysis of movers in the A.C.E. P-sample conducted by the Census Bureau in summer 2001 supported the assumption of more complete reporting of inmovers: the total weighted number of outmovers was only two-thirds (0.66) the total weighted number of inmovers (the outmover/inmover ratio varied for broad population groups from less than 0.51 to more than 0.76; Liu et al., 2001:App.A). A subsequent evaluation found little effect on the dual-systems population estimates of using inmovers to estimate the number of movers (Keathley, 2002).

Noninterview and missing data rates were substantially higher for outmovers compared with inmovers and nonmovers (see Sections 6-A.5 and 6-A.6), so one might have expected to see an increase in variance of the dual-systems estimates from using outmovers to estimate match rates from the P-sample. Yet Liu et al. (2001) found that movers had no more and probably less of an effect on the dual-systems estimates in 2000 than in 1990. A primary reason for this result is that the percentage of movers among the total population was lower in the A.C.E. than in the PES: using the number of inmovers in the numerator, the A.C.E. mover rate was 5.1 percent, compared with a mover rate of 7.8 percent in the PES. In turn, the lower A.C.E.
mover rate resulted from the 2-month head start that was achieved by telephone interviewing in the A.C.E. (see Section 6-A.2). The mover rate for A.C.E. cases interviewed after June 25, 2000, was comparable to the PES mover rate (8.2 and 7.8 percent, respectively); the mover rate for A.C.E. cases interviewed before June 25, 2000, was only 2.1 percent (Liu et al., 2001:5).

[3] For 63 poststrata with fewer than 10 outmovers, the weighted number of outmovers was used instead.
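The PES-C mover calculation described in this section can be sketched as follows. This is a minimal illustration rather than the Bureau's production procedure: the class and field names are hypothetical, the input figures are invented, and the sketch assumes that the threshold of 10 outmovers in footnote 3 refers to unweighted sample cases and that the poststratum match rate is matched nonmovers plus matched movers, divided by nonmovers plus movers.

```python
from dataclasses import dataclass

MIN_OUTMOVER_CASES = 10  # threshold from footnote 3; unweighted count is an assumption


@dataclass
class PoststratumPSample:
    """Weighted P-sample totals for one poststratum (hypothetical structure)."""
    nonmovers: float
    matched_nonmovers: float
    outmovers: float
    matched_outmovers: float
    inmovers: float
    outmover_sample_cases: int  # unweighted number of outmover cases


def pes_c_match_rate(ps: PoststratumPSample) -> float:
    """Poststratum match rate with movers handled in the PES-C manner."""
    if ps.outmover_sample_cases < MIN_OUTMOVER_CASES:
        # Too few outmovers: use the weighted outmovers themselves
        # as the mover count instead of the inmovers.
        mover_total = ps.outmovers
        matched_movers = ps.matched_outmovers
    else:
        # Apply the outmover match rate to the weighted number of inmovers.
        outmover_match_rate = ps.matched_outmovers / ps.outmovers
        mover_total = ps.inmovers
        matched_movers = outmover_match_rate * ps.inmovers
    return (ps.matched_nonmovers + matched_movers) / (ps.nonmovers + mover_total)


# Invented totals for two poststrata that differ only in their mover data.
large = PoststratumPSample(9000.0, 8300.0, 330.0, 260.0, 500.0, 40)
small = PoststratumPSample(9000.0, 8300.0, 900.0, 700.0, 1200.0, 6)

print(f"PES-C match rate, 40 outmover cases: {pes_c_match_rate(large):.4f}")
print(f"PES-C match rate, 6 outmover cases:  {pes_c_match_rate(small):.4f}")
```

The design choice the sketch highlights is that outmovers supply the match rate while inmovers supply the count of movers, except where the outmover sample is too thin to do so.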
6–A.4 Defining the E-Sample: Exclusion of “Insufficient Information” Cases

Dual-systems estimation in the census context requires that census enumerations be excluded from the E-sample when they have insufficient information for matching and follow-up (so-called IIs; see Section 5-A). The 2000 census had almost four times as many IIs as the 1990 census: 8.2 million, or 2.9 percent of the household population, compared with 2.2 million or 0.9 percent of the population. In 2000, 5.8 million people fell into the II category because they were whole-person imputations (types 1–5, as described in Section 4-D); another 2.4 million people were IIs because their records were not available in time for the matching process. These people were not in fact enumerated late; rather, they represented records that were temporarily deleted and subsequently reinstated on the census file as part of the special MAF unduplication process in summer–fall 2000 (see Section 4-E). In 1990 only 1.9 million whole-person imputations and 0.3 million late additions from coverage improvement programs fell into the II category.

Because the phenomenon of reinstated cases in the 2000 census was new and the number of such cases was large, the Bureau investigated the possible effects of their exclusion from the E-sample on the dual-systems estimate. Hogan (2001b) demonstrated conceptually that excluding the reinstated people would have little effect so long as they were a small percentage of census correct enumerations or their A.C.E. coverage rate (ratio of matches to all correct enumerations) was similar to the E-sample coverage rate. To provide empirical evidence, a clerical matching study was conducted in summer 2001 of reinstated people whose census records fell into an evaluation sample of one-fifth of the A.C.E. block clusters (Raglin, 2001).
This study found that 53 percent of the reinstated records in the evaluation sample duplicated another census record (and, hence, had no effect on the DSE), 25 percent matched to the P-sample, and 22 percent were unresolved (such a large percentage resulted from the infeasibility of follow-up to obtain additional information). Using a range of correct enumeration rates for the unresolved cases, the analysis demonstrated that the exclusion of reinstated records from the E-sample had a very small effect on the DSE for the total population (less than one-tenth of 1 percent). Moreover, because
total and matched reinstated cases were distributed in roughly the same proportions among age, sex, race/ethnicity, and housing tenure groups, their exclusion from the E-sample had similar (negligible) effects on the DSE estimates for major poststrata.

Nonetheless, the large number of IIs in 2000 cannot be ignored in understanding patterns of net undercount. Although reinstated cases accounted for roughly the same proportion of each major poststratum group (about 1 percent) in 2000, whole-person imputations accounted for higher proportions of historically undercounted groups, such as minorities, renters, and children, than of historically better counted groups. We consider the role of whole-person imputations in helping to account for the measured reduction in net undercount rate differences among major population groups in 2000 from 1990 in Section 6-C.1.

6–A.5 Household Noninterviews in the P-Sample

The P-sample survey is used to estimate the match rate component of the dual-systems estimation formula. A small bias in the match rate can have a disproportionately large effect on the estimated net undercount (or overcount) because coverage error is typically so small relative to the total population (1–2 percent or less). To minimize variance and bias in the estimated match rate, it is essential that the A.C.E. successfully interview almost all P-sample households and use appropriate weighting adjustments to account for noninterviewed households.

Interview/Noninterview Rates

Overall, the A.C.E. obtained interviews from 98.9 percent of households that were occupied on the day of interview. This figure compares favorably with the 98.4 percent interview rate for the 1990 PES.[4] However, the percentage of occupied households as of Census Day that were successfully interviewed in the A.C.E. was somewhat lower (97 percent), meaning that a weighting adjustment had to account for the remaining 3 percent of noninterviewed households.
[4] These percentages are unweighted; they are about the same as weighted percentages for 2000. Weighted percentages are not available for 1990 (see Cantwell et al., 2001).
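A noninterview weighting adjustment of the kind mentioned above is commonly computed as a ratio within adjustment cells: the weights of interviewed households are inflated so that they sum to the weight of all households that should have been interviewed. The sketch below assumes that generic ratio form. The cell labels (a block cluster and housing type pair), field names, and weights are hypothetical, and the code is not the A.C.E. specification.

```python
from collections import defaultdict


def noninterview_adjusted_weights(households):
    """Return interviewed households with ratio-adjusted weights.

    Each household is a dict with keys 'cell', 'weight', and 'interviewed'.
    Within a cell, factor = (weight of all eligible households)
                            / (weight of interviewed households).
    Cells with no interviewed household contribute nothing here; a real
    procedure would need a rule (such as collapsing cells) for that case.
    """
    eligible = defaultdict(float)
    responding = defaultdict(float)
    for hh in households:
        eligible[hh["cell"]] += hh["weight"]
        if hh["interviewed"]:
            responding[hh["cell"]] += hh["weight"]

    adjusted = []
    for hh in households:
        if not hh["interviewed"]:
            continue
        factor = eligible[hh["cell"]] / responding[hh["cell"]]
        adjusted.append({**hh, "factor": factor, "weight": hh["weight"] * factor})
    return adjusted


# Hypothetical cluster: one single-family household was not interviewed.
sample = [
    {"cell": ("cluster-1", "single-family"), "weight": 100.0, "interviewed": True},
    {"cell": ("cluster-1", "single-family"), "weight": 100.0, "interviewed": True},
    {"cell": ("cluster-1", "single-family"), "weight": 100.0, "interviewed": False},
    {"cell": ("cluster-1", "apartment"), "weight": 100.0, "interviewed": True},
    {"cell": ("cluster-1", "apartment"), "weight": 100.0, "interviewed": True},
]

result = noninterview_adjusted_weights(sample)
for hh in result:
    print(hh["cell"], f"factor={hh['factor']:.2f}", f"weight={hh['weight']:.1f}")
```

In a cell where every household was interviewed the factor is exactly 1.0, so those weights are left unchanged.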
The lower interview rate for Census Day households was due largely to the difficulty of finding a respondent for housing units in the P-sample that were entirely occupied by people who moved out between the time of the census and the A.C.E. interview (outmovers). Such units were often vacant, and it was not always possible to interview a neighbor or landlord who was knowledgeable about the Census Day residents. The interview rate for outmover households was 81.4 percent. Such households comprised 4 percent of Census Day occupied households in the P-sample.

Noninterview Weighting Adjustments

Two weighting adjustments, one for the A.C.E. interview day and one for Census Day, were calculated so that interviewed households would represent all households that should have been interviewed. Each of the two weighting adjustments was calculated separately for households by type (single-family unit, apartment, other) within block cluster. For Census Day, what could have been a relatively large noninterview adjustment for outmover households in a block cluster was spread over all interviewed Census Day households in the cluster for each of the three housing types. Consequently, adjustments to the weights for interviewed households were quite low, which had the benefit of minimizing the increase in the variance of A.C.E. estimates due to differences among weights: 52 percent of the weights were not adjusted at all because all occupied households in the adjustment cell were interviewed; for another 45 percent of households, the weighting adjustment was between 1.0 and 1.2 (Cantwell et al., 2001:Table 2).

Evaluation

Although the P-sample household noninterview adjustments were small, a sensitivity analysis determined that alternative weighting adjustments could have a considerable effect on the estimated value of the DSE for the national household population.
Three alternative noninterview adjustments were tested: assigning weights on the basis of characteristics other than those used in the original A.C.E. estimation, assigning weights only to
late-arriving P-sample interviews, and replicating the weights for missing interviews from nearby households. All three alternatives produced weighted total household population estimates that were higher than the original (March 2001) DSE estimate (Keathley et al., 2001:Table 4). Two of the alternative estimates exceeded the original estimate by 0.5–0.6 million people, which translates into an added 0.2 percentage points of net undercount on a total household population of 277.2 million. The differences between these two estimates and the original estimate also exceeded the standard error of the original estimate, which was 0.4 million people.

6–A.6 Missing and Unresolved Data in the P-Sample and E-Sample

Missing and unresolved person data can bias the estimated P-sample match rate, the estimated E-sample correct enumeration rate, or both rates. Imputation procedures used to fill in missing values can also add bias and variance, so achieving high-quality P-sample and E-sample data is critical for dual-systems estimation.

Missing Characteristics Needed for Poststratification

Overall rates of missing characteristics data in the P-sample and E-sample were low, ranging between 0.2 and 3.6 percent for age, sex, race, Hispanic origin, and housing tenure. Missing data rates for most characteristics were somewhat higher for the E-sample than for the P-sample. Missing data rates for the 2000 A.C.E. showed no systematic difference (up or down) from the 1990 PES; see Table 6.1. As would be expected, missing data rates in the P-sample were higher for proxy interviews, in which someone outside the household supplied information, than for interviews with household members; see Table 6.2.
By mover status, missing data rates were much higher for outmovers than for nonmovers and inmovers, which is not surprising given that 73.3 percent of interviews for outmovers were obtained from proxies, compared with only 2.9 and 4.8 percent of proxy interviews for nonmovers and inmovers, respectively. Even “nonproxy” interviews for outmovers could have been from household members who did not know the outmover.
[...]
error for some geographic areas. We understand that the revised E-sample poststrata better explained variations in correct enumeration rates compared with the original E-sample poststrata. However, there were no logical counterparts on the P-sample side for some of the E-sample poststrata (including those based on proxy response and type of return), and the use of different poststrata could have introduced bias for some estimates.

Because of these sources of uncertainty, our view is that the Census Bureau’s decision not to use the A.C.E. Revision II estimates to adjust the census data that provide the basis for postcensal estimates was justified. A consideration in our agreement with the Bureau’s decision against adjustment was the finding that estimated net undercount rates and differences in net undercount rates for population groups (and, most probably, subnational areas) were smaller in 2000 than in 1990. The smaller the measured net coverage errors in the census, the smaller would be the effects of an adjustment on important uses of the data. Because the benefits of an adjustment would be less when net coverage errors are small, a high level of confidence is needed that an adjustment would not significantly increase the census errors for some areas and population groups. In our judgment, the A.C.E. Revision II estimates, given the constraints of the available data for correcting the original A.C.E. estimates, are too uncertain for use in this context.

We do not intend by this conclusion, however, to set a standard of perfection whereby it would never be possible to carry out an adjustment that improved on the census counts. Indeed, had it been possible to implement the A.C.E. Revision II methodology from the outset on the original A.C.E.
data and to make some other improvements in the estimation (see Section 6-E), it is possible that an adjustment of the 2000 census data could have been implemented that was well supported.

6–D.9 Revision II Coverage Evaluation Findings

See also Section 6-A.11.

Finding 6.2: The Census Bureau commendably dedicated resources to the A.C.E. Revision II effort, which completely reestimated net undercount (and overcount) rates for several hundred population groups (poststrata)
by using data from the original A.C.E. and several evaluations. The work exhibited high levels of creativity and effort devoted to a complex problem. From innovative use of matching technology and other evaluations, it provided substantial additional information about the numbers and sources of erroneous census enumerations and, similarly, information with which to correct the residency status of the independent A.C.E. sample. It provided little additional information, however, about the numbers and sources of census omissions. Documentation for the original A.C.E. estimates (March 2001), the preliminary revised estimates (October 2001), and the A.C.E. Revision II estimates (March 2003) was timely, comprehensive, and thorough.

Finding 6.3: We support the Census Bureau’s decision not to use the March 2003 Revision II A.C.E. coverage measurement results to adjust the 2000 census base counts for the Bureau’s postcensal population estimates program. The Revision II results are too uncertain to be used with sufficient confidence about their reliability for adjustment of census counts for subnational geographic areas and population groups. Sources of uncertainty stem from the small samples of the A.C.E. data that were available to correct components of the original A.C.E. estimates of erroneous enumerations and non-A.C.E. residents and to correct the original estimate of nonmatches and the consequent inability to make these corrections for other than very large population groups; the inability to determine which of each pair of duplicates detected in the A.C.E. evaluations was correct and which should not have been counted in the census or included as an A.C.E.
resident; the possible errors in subnational estimates from the choice of one of several alternative correlation bias adjustments to compensate for higher proportions of missing men relative to women; the inability to make correlation bias adjustments for population groups other than blacks and nonblacks; and the possible errors for some small
areas from the use of different population groups (poststrata) for estimating erroneous census enumerations and census omissions. In addition, there is a large discrepancy in coverage estimates for children ages 0–9 when comparing demographic analysis estimates with Revision II A.C.E. estimates (2.6 percent undercount and 0.4 percent net overcount, respectively).

Finding 6.4: Demographic analysis helped identify possible coverage problems in the 2000 census and in the A.C.E. at the national level for a limited set of population groups. However, there are sufficient uncertainties in the revised estimates of net immigration (particularly the illegal component) and the revised assumption of completeness of birth registration after 1984, compounded by the difficulties of classifying people by race, so that the revised demographic analysis estimates cannot and should not serve as the definitive standard of evaluation for the 2000 census or the A.C.E.

Finding 6.5: Because of significant differences in methodology for estimating net undercount in the 1990 Post-Enumeration Survey Program and the 2000 Accuracy and Coverage Evaluation Program (Revision II), it is difficult to compare net undercount estimates for the two censuses. Nevertheless, there is sufficient evidence (from comparing the 1990 PES and the original A.C.E.) to conclude that the national net undercount of the household population and net undercount rates for population groups were reduced in 2000 from 1990 and, more important, that differences in net undercount rates between historically less-well-counted groups (minorities, children, renters) and others were reduced as well. From smaller differences in net undercount rates among groups and from analysis of available information for states and large counties and places, it is reasonable to infer that differences in net undercount rates among geographic areas were also probably smaller in 2000 compared with 1990.
Despite reduced differences in
net undercount rates, some groups (e.g., black men and renters) continued to be undercounted in 2000.

Finding 6.6: Two factors that contributed to the estimated reductions in net undercount rates in 2000 from 1990 were the large numbers of whole-person imputations and duplicate census enumerations, many of which were not identified in the original (March 2001) A.C.E. estimates. Contributing to duplication were problems in developing the Master Address File and respondent confusion about or misinterpretation of census “usual residence” rules, which resulted in duplication of household members with two homes and people who were enumerated at home and in group quarters.

6–E RECOMMENDATIONS FOR COVERAGE EVALUATION IN 2010

6–E.1 An Improved Accuracy and Coverage Evaluation Program

The complexities of the original A.C.E. and Revision II reestimation and the uncertainties about what the Revision II results tell us about net and gross coverage errors in the 2000 census could lead policy makers to question the value of an A.C.E.-type coverage evaluation program for the 2010 census. To the contrary, we recommend that research and development for the 2010 census give priority to improving an A.C.E.-type coverage evaluation mechanism and that it be implemented in 2010.

Without the 2000 original A.C.E. and Revision II estimation, we would not have acquired so much valuable information about strengths and weaknesses of the 2000 census. In particular, differences between the census counts, the original A.C.E., and the original demographic analysis estimates spurred the development of innovative methods for identifying duplicate census enumerations. These differences also motivated a reexamination of assumptions about immigration estimates in demographic analysis.
The plans for the 2010 census include the serious possibility that the matching methods used in the Further Study of Person Duplication would be used as part of the enumeration process, so that duplicate enumerations could be identified, followed up, and eliminated from the census counts in real time (Smith and Whitford, 2003).[22] If these plans reach fruition, then the 2010 census could be expected to have many fewer erroneous enumerations than the 2000 census. Because it is difficult to imagine the development of effective new ways of reducing census omissions, a reduction in erroneous enumerations could well result in a significant net undercount in the 2010 census and an increase in differential undercoverage among population groups. Without an A.C.E.-type coverage evaluation program, it would not be possible to measure such an undercount or to adjust some or all of the census counts for coverage errors should that be warranted. Demographic analysis, while providing useful information about census coverage at the national level for a few population groups, could not substitute for an A.C.E.

We urge that the 2010 census testing program give priority to research and development for an improved A.C.E.-type coverage evaluation program. We see possibilities for improvements in many areas, such as the estimation of components of gross census error as well as net error, expansion of the search area for erroneous census enumerations and P-sample nonresidents, the inclusion of group quarters residents, better communication to respondents of residence rules (and reasons for them), understanding the effects of IIs on A.C.E. estimation, the treatment of movers, and the development of poststrata. The optimal size of a 2010 A.C.E. is also a consideration. The survey must be large enough to provide estimates of coverage errors with the level of precision that was targeted for the original (March 2001) A.C.E. estimates for population groups and geographic areas.
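The concern raised above, that removing erroneous enumerations without also reducing omissions could turn a small net overcount into a net undercount, follows from simple arithmetic. The sketch below is only an illustration under a simplifying assumption: the census count is taken to equal the true population minus omissions plus erroneous enumerations, ignoring imputations and other components. The function name and every figure are hypothetical and are not census estimates.

```python
def net_undercount_rate(true_population, omissions, erroneous_enumerations):
    """Return net undercount as a percentage of the true population.

    A negative value indicates a net overcount. Simplifying assumption:
    census count = true population - omissions + erroneous enumerations.
    """
    census_count = true_population - omissions + erroneous_enumerations
    return 100.0 * (true_population - census_count) / true_population


# Hypothetical figures chosen only to show the direction of the effect.
TRUE_POPULATION = 1_000_000
OMISSIONS = 30_000  # held fixed: no new way of reducing omissions

# Scenario 1: many erroneous enumerations (e.g., undetected duplicates).
before = net_undercount_rate(TRUE_POPULATION, OMISSIONS, 35_000)

# Scenario 2: most erroneous enumerations removed during enumeration.
after = net_undercount_rate(TRUE_POPULATION, OMISSIONS, 10_000)

print(f"net undercount with many erroneous enumerations: {before:.1f}%")
print(f"net undercount after removing most of them:      {after:.1f}%")
```

With omissions unchanged, the hypothetical net rate moves from a small overcount to an undercount, which is the kind of shift that only an independent coverage measurement could detect.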
With regard to the estimates of erroneous enumerations and P-sample nonresidents outside the traditional search area, the nationwide matching technology developed for the 2000 A.C.E. Revision II would make it possible to incorporate the search for such errors as part of the 2010 A.C.E. production process, not waiting for subsequent evaluation. Such cases could be identified and followed up in real time, similar to what is planned for the census itself. Such a procedure could not only significantly reduce errors in the A.C.E. but also provide valuable information about types of gross errors that has not previously been available.

[22] Some observers may be concerned about privacy issues with regard to the capture of names on the computerized census records and their use for matching on such a large scale. The panel did not consider this issue, but we note that the Census Bureau has always been sensitive to such concerns, and Title 13 of the U.S. Code safeguards the data against disclosure.

The nationwide matching technology, together with the possible increased use of administrative records for group quarters enumeration, could make it possible to include group quarters residents in the 2010 A.C.E. with an acceptable level of data quality. Administrative records for such group quarters as college dormitories, prisons, nursing homes, and other institutions could provide home addresses for use in the matching to identify duplicate enumerations. With this information, the follow-up to verify a duplicate enumeration of a college student, for example, would simply need to establish that the student was in fact the same person, and the census residence rules would then be applied to designate the group quarters enumeration as correct and the home enumeration as erroneous. There would be no need to make the family choose the student’s “usual residence.”

With regard to communication of residence rules, cognitive research on the A.C.E. questionnaires and interviewer prompts could lead to interviewing procedures that better help respondents understand the Bureau’s rules and reasons for them. The 2000 A.C.E. demonstrated that it is not enough to rely on respondents’ willingness to follow the rules (e.g., when parents report a college student at home), which is a major reason for incorporating nationwide matching technology into the 2010 A.C.E. process. However, cognitive research could probably improve the interview process in ways that would improve the quality of residence information reported to the A.C.E.
(Such research is also likely to improve the census enumeration.) Furthermore, if plans to use mobile computing devices and global positioning system (GPS) technology for address listing and nonresponse follow-up in 2010 come to fruition, then there is the opportunity to reduce geocoding errors in the E-sample. Such technology could also be used to minimize geocoding errors in the listing operations conducted to build the independent P-sample address list.
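The group quarters follow-up logic described earlier in this section (confirm that two enumerations refer to the same person, then apply the residence rules) can be sketched in a few lines. This is a minimal illustration and not the Census Bureau's matching method: the record layout, the exact match on name and birth date as a stand-in for verified identity, and the status labels are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Enumeration:
    record_id: str
    name: str
    birth_date: str  # ISO date string, e.g. "1981-05-17"
    place_type: str  # "household" or "group_quarters"


def resolve_duplicates(records):
    """Assign a status to each record after grouping likely duplicates.

    Assumption: an exact match on normalized name and birth date stands in
    for the follow-up step that confirms two records are the same person.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec.name.strip().lower(), rec.birth_date)].append(rec)

    status = {}
    for recs in groups.values():
        gq_count = sum(r.place_type == "group_quarters" for r in recs)
        for r in recs:
            if len(recs) == 1:
                status[r.record_id] = "correct"
            elif gq_count == 1:
                # Residence rule from the text: the group quarters
                # enumeration is correct, the home enumeration erroneous.
                is_gq = r.place_type == "group_quarters"
                status[r.record_id] = "correct" if is_gq else "erroneous"
            else:
                # No single group quarters record to prefer.
                status[r.record_id] = "needs_follow_up"
    return status


# Hypothetical records: a student counted at a dormitory and at home.
sample = [
    Enumeration("H1", "Pat Doe", "1981-05-17", "household"),
    Enumeration("G1", "pat doe ", "1981-05-17", "group_quarters"),
    Enumeration("H2", "Lee Roe", "1950-01-02", "household"),
]
status = resolve_duplicates(sample)
print(status)
```

In this sketch the family is never asked to choose a "usual residence"; the rule is applied mechanically once the two records are linked, and any group without exactly one group quarters record is routed to follow-up instead of being decided automatically.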
With regard to understanding the effects of census records that are excluded from the A.C.E. matching (IIs), such records in 2010 would probably be whole-person imputations. The large number of records that were reinstated from the special summer 2000 housing unit unduplication operation should not affect 2010, given that matching to detect duplicate addresses (and people) will probably be built into the enumeration process. However, there could still be large numbers of whole-person and whole-household imputations. In order to more fully understand the effects of an adjustment for population groups and geographic areas, it is important to analyze the distribution of such imputations and, in particular, how they may affect synthetic error.

For movers, with the nationwide matching capability that has been developed, it should be possible to use the PES-B procedure for a 2010 A.C.E., instead of the cumbersome PES-C procedure that was used in 2000. The speed of the 2000 P-sample interviewing reduced the number of movers and, consequently, their effects on the dual-systems estimates, but the quality of their data was markedly below that for nonmovers and inmovers. Finding where inmovers resided on Census Day would be greatly facilitated by nationwide matching, so that a PES-B procedure would be feasible and likely to provide improved data quality compared with its use in the 1990 PES.

Finally, with regard to poststratification, the Revision II effort to revise the E-sample poststrata on the basis of analyzing the A.C.E. data themselves was commendable. Further work on poststratification could be conducted with the 2000 data in planning for 2010, and plans for using the 2010 A.C.E. data to refine the poststrata could also be developed.
Care should be taken to develop poststrata that do not produce the anomalies that were observed in Revision II from the use of E-sample poststrata for proxy enumerations for which no counterparts were developed on the P-sample side. The use of statistical modeling for developing poststrata from the A.C.E. data should also be part of the poststratification research for 2010.

We are confident that these and other improvements could be made in an A.C.E.-type coverage evaluation program for the 2010 census with sufficient attention to research, development, and testing in the next few years. The U.S. General Accounting Office (2003a:1) reported that the Census Bureau “obligated about $207 million to [the 2000 census A.C.E. and the predecessor ICM program] from fiscal years 1996 through 2001, which was about 3 percent of the $6.5 billion total estimated cost of the 2000 Census” (see also U.S. General Accounting Office, 2002b). An equivalent expenditure for an A.C.E.-type program in 2010 would be money well spent to ensure that adequate data become available with which to evaluate not only net, but also gross coverage errors. Such errors could be more heavily weighted toward omissions, and not erroneous enumerations, in 2010 compared with the 2000 experience.

Recommendation 6.1: The Census Bureau and administration should request, and Congress should provide, funding for the development and implementation of an improved Accuracy and Coverage Evaluation Program for the 2010 census. Such a program is essential to identify census omissions and erroneous enumerations and to provide the basis for adjusting the census counts for coverage errors should that be warranted. The A.C.E. survey in 2010 must be large enough to provide estimates of coverage errors that provide the level of precision targeted for the original (March 2001) A.C.E. estimates for population groups and geographic areas. Areas for improvement that should be pursued include:
the estimation of components of gross census error (including types of erroneous enumerations and omissions), as well as net error;
the identification of duplicate enumerations in the E-sample and nonresidents in the P-sample by the use of new matching technology;
the inclusion of group quarters residents in the A.C.E. universe;
improved questionnaire content and interviewing procedures about place of residence;
methods to understand and evaluate the effects of census records that are excluded from the A.C.E. matching (IIs);
a simpler procedure for treating people who moved between Census Day and the A.C.E. interview;
the development of poststrata for estimation of net coverage errors, by using census results and statistical modeling as appropriate; and
the investigation of possible correlation bias adjustments for additional population groups.

6–E.2 Improved Demographic Analysis for 2010

We support the usefulness of demographic analysis for intercensal population estimation and for helping to identify areas of possible enumeration problems in the census and the A.C.E. For this reason, it is important for the Census Bureau to continue its efforts to obtain additional funding for research and development of demographic analysis methods, particularly for estimates of immigrants, and to develop methods for estimating uncertainty in demographic analysis population estimates. Such developmental work needs to be conducted with other federal statistical agencies that have relevant data and that make use of postcensal population estimates.

Recommendation 6.2: The Census Bureau should strengthen its program to improve demographic analysis estimates, in concert with other statistical agencies that use and provide data inputs to the postcensal population estimates. Work should focus especially on improving estimates of net immigration. Attention should also be paid to quantifying and reporting measures of uncertainty for the demographic estimates.

6–E.3 Time for Evaluation and Possible Adjustment

The original A.C.E. data collection, matching, estimation, and initial evaluation were carried out according to carefully specified and controlled procedures with commendable timeliness (see Finding 6.1 in Section 6-A.11). However, the experience with the subsequent evaluations and A.C.E. Revision II demonstrates that the
development of net coverage estimates that are judged to be sufficiently reliable to use in evaluation of the census counts (and possible adjustment) should not be rushed. Similarly, even if the process for evaluating census operations and data items is improved relative to 2000 (see Chapter 9), that process, which is important to verifying the quality of the census content, requires ample time. Consequently, the panel believes that adequate evaluation of the census block-level data for congressional redistricting is not possible by the current deadline of 12 months after Census Day. The Congress should consider changing this deadline to provide additional time for evaluation and delivery of redistricting data.

Recommendation 6.3: Congress should consider moving the deadline to provide block-level census data for legislative redistricting to allow more time for evaluation of the completeness of population coverage and quality of the basic demographic items before they are released.