National Academies Press: OpenBook

Assessing the 2020 Census: Final Report (2023)

Chapter: 4 Coverage Measurement in the 2020 Census

Suggested Citation:"4 Coverage Measurement in the 2020 Census." National Academies of Sciences, Engineering, and Medicine. 2023. Assessing the 2020 Census: Final Report. Washington, DC: The National Academies Press. doi: 10.17226/27150.

– 4 –

Coverage Measurement in the 2020 Census

Every census since 1790 has missed some people who should have been counted, counted some people twice, and counted other people who should not have been counted (e.g., someone born after Census Day). The balance of omissions and erroneous inclusions determines whether the census experienced a net undercount or a net overcount for the nation as a whole. In addition, people can be missed in one place and erroneously counted in another place according to census rules regarding people’s usual residence. People may also misreport information such as age, or may fail to report information such as race or ethnicity and then have it erroneously imputed. These kinds of errors do not affect coverage for the nation but do add to the net undercount or overcount in a geographic area or population group.

Differential undercount—that is, the difference in coverage rates between population groups or geographic areas—is more concerning than the national net overcount or undercount. The reason is that many important uses of census data involve allocating shares of a fixed pie—whether that pie represents seats in the U.S. House of Representatives allotted to the various states, or funds allotted to geographic locales by a formula that uses census or census-derived data for the allocation. Indeed, a census that obtained close to the correct number of people, balancing omissions against erroneous enumerations, but had large differences in net undercount rates between population groups and areas could be less equitable than a census that uniformly missed 1–2% of people across all groups and areas.
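The fixed-pie argument can be illustrated with a minimal sketch. It compares two hypothetical censuses of two areas with equal true populations: one that misses 2% of people everywhere, and one whose national total is exactly right but whose errors differ by area. The area names, populations, error rates, and the `shares` helper are all illustrative assumptions, not census results.

```python
def shares(counts):
    """Return each area's share of the total count (the 'fixed pie')."""
    total = sum(counts.values())
    return {area: n / total for area, n in counts.items()}

# Hypothetical true populations for two areas (illustrative only).
true_pop = {"A": 1_000_000, "B": 1_000_000}

# Census 1: every area is undercounted by the same 2%.
uniform = {area: n * 0.98 for area, n in true_pop.items()}

# Census 2: the national total is correct, but area A is overcounted
# by 3% and area B is undercounted by 3% (a differential error).
differential = {"A": true_pop["A"] * 1.03, "B": true_pop["B"] * 0.97}

true_shares = shares(true_pop)
uniform_shares = shares(uniform)
differential_shares = shares(differential)

for area in true_pop:
    print(
        area,
        f"true={true_shares[area]:.3f}",
        f"uniform={uniform_shares[area]:.3f}",
        f"differential={differential_shares[area]:.3f}",
    )
```

Under these assumptions, the uniform undercount leaves each area's share at its true value, while the census with the correct national total shifts shares from area B to area A.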

In this chapter, we describe how the U.S. Census Bureau estimates net undercount rates using Demographic Analysis (DA) and a Post-Enumeration Survey (PES) with dual-system estimation (DSE), and how the PES is used to estimate the contribution of errors such as omissions or duplications (termed gross errors) to the net undercount. We then present DA and PES results for the 2020 Census compared with the 2010 Census. The major finding is that, while the total population in both the 2020 and 2010 Censuses closely matched the DA- and PES-estimated totals, the differential net undercount increased from 2010 to 2020. Net undercounts increased substantially for Hispanic people, Black people, and American Indians on reservations, while net overcounts increased for White and Asian people.1 We discuss problems affecting the 2020 DA and PES estimates and additional research to evaluate the two methods and expand their use to identify potential ways to reduce the differential net undercount in the 2030 Census. The chapter ends with our conclusions and recommendations.

4.1 ESTIMATING COVERAGE ERRORS

There are several methods to estimate coverage errors in the census (see Box 4.1). In recent censuses, the Census Bureau has used DA and PES exclusively. Each method has strengths and weaknesses.

4.1.1 Demographic Analysis

DA is generally considered as close to a gold standard as is possible for estimates of the population as of April 1 of a census year. The current DA program is based on decades of analysis of censuses and administrative records; since the advent of the Medicare program in 1965, it has used separate methods for two age groups. For people born in 1945 or later, when birth records were complete across the nation, the 2020 DA estimates reflect the addition of births over the decades, the subtraction of deaths, and the addition of estimates of net international migration (from various sources, including the American Community Survey [ACS]). For people ages 75 and over (born in 1945 and earlier), the 2020 estimates are based on Medicare enrollment data, adjusted for people not covered. The Medicare enrollment method is more accurate for this group than using birth records given gaps in records for older people. DA estimates are largely independent of the census.2

___________________

1 The 2000 PES also estimated that Black and Hispanic people were undercounted and White and Asian people overcounted (National Research Council, 2004a:Table 6.7, Revision II A.C.E. estimates).

2 See Jensen et al. (2020). Note that censuses from 1970 through 2010 used Medicare enrollment data for people ages 65 and older (Robinson, 2010).

DA estimates are compared with census counts to produce net undercount estimates; the method cannot be used to produce estimates of gross errors, such as omissions or duplications. DA estimates are limited to the nation because estimates of interstate migration are not sufficiently accurate for the preparation of state estimates that can be compared with census results.3 DA estimates are produced for single years of age, sex, and Black and non-Black people (separate estimates for Black Alone and Black Alone or in Combination). Estimates are not produced for other race groups because of gaps in birth- and death-record identification of race. For 2020, experimental DA estimates were produced for Hispanic people ages 0–29 (estimates for ages 0–19 were produced for 2010). Beginning in 2010, the Census Bureau has released a range of DA estimates, to reflect uncertainty in several components of the estimation. For 2020, the Census Bureau released three series—low, middle, and high—which used varied assumptions about births, deaths, net international migration, and Medicare enrollment for people ages 75 and older (see Jensen et al., 2020:Tables 3–8).
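As a rough sketch of the arithmetic described above, the following example builds a DA-style estimate for a single cohort covered by birth records (births minus deaths plus net international migration) under three series and compares each with a census count, using the DA estimate as the denominator. All numbers, the series assumptions, and the function names are hypothetical; the Census Bureau's actual DA program, including its Medicare-based method for the oldest ages, is considerably more detailed.

```python
def da_estimate(births, deaths, net_international_migration):
    """Cohort estimate: births minus deaths plus net international migration."""
    return births - deaths + net_international_migration

def net_coverage_error(census_count, da_total):
    """Percentage net coverage error with the DA estimate as denominator.

    Positive values are net overcounts; negative values are net undercounts.
    """
    return 100.0 * (census_count - da_total) / da_total

# Hypothetical component assumptions (in thousands) for one cohort.
series = {
    "low": {"births": 4000, "deaths": 60, "net_international_migration": 150},
    "middle": {"births": 4000, "deaths": 55, "net_international_migration": 200},
    "high": {"births": 4010, "deaths": 50, "net_international_migration": 260},
}

census_count = 4120  # hypothetical census count for the same cohort (thousands)

estimates = {name: da_estimate(**parts) for name, parts in series.items()}
errors = {name: net_coverage_error(census_count, est) for name, est in estimates.items()}

for name in series:
    print(f"{name}: DA={estimates[name]}, net coverage error={errors[name]:.2f}%")
```

Because the three series bracket the census count differently, the same census count can appear as a net overcount under the low series and a net undercount under the middle and high series.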

The Census Bureau released its 2020 DA estimates in December 2020.4 In March 2022, it released net undercount estimates, comparing the 2020 DA estimates with the 2020 Census population counts, by single years of age and sex.5 As of this writing, the Census Bureau has not yet released net undercount estimates by race (see Section 4.3). Data are now available from the Demographic and Housing Characteristics (DHC) File to estimate net undercount for Hispanic children and young adults (up to age 29). The range between the low and high DA series for these groups, however, is too wide for the estimates to be useful (see Section 4.3.1 for a discussion of the estimates and the need for further research to narrow the range).

4.1.2 Post-Enumeration Survey with Dual System Estimation

The other method used to evaluate coverage in recent censuses is DSE applied to the results of matching the responses from a PES (P-sample) to census enumerations (E-sample) in a sample of blocks (see Table 4.1). The goal is to estimate the true population (the bottom right cell shown with darker shading) using some of the estimates that can be constructed for other (lightly shaded) cells in the table.

___________________

3 The Census Bureau plans to release experimental 2020 estimates of net overcount or undercount of children ages 0–4 for states and counties.

4 See https://www.census.gov/newsroom/press-releases/2020/2020-demographic-analysis-estimates.html.

5 See https://www.census.gov/newsroom/press-releases/2022/2020-census-estimates-of-undercount-and-overcount.html. To permit 2020 Census–DA comparisons by single years of age and sex, when census data were not scheduled to be publicly available until the DHC File was released in May 2023, the Census Bureau prepared a national-level census tabulation for internal use, protected using a differential privacy-based algorithm with a “light touch.”

Table 4.1 Components of the Dual-System Estimator of the True

E-Sample: Correctly      P-Sample: Correctly Enumerated in PES?
Enumerated in Census?    Yes                      No                       Total
Yes                      N11                      N12                      N1+
                         Census-PES Matches       In Census, Out of PES    Correct Census Total
No                       N21                      N22                      N2+
                         In PES, Out of Census    Missing from Both        Missing from Census
Total                    N+1                      N+2                      N++
                         Correct PES Total        Missing from PES         True Population Total

NOTES: PES, Post-Enumeration Survey.

SOURCE: Panel generated.

The DSE estimate of the true total population exploits the assumed (statistical) independence of the census results and the PES results. Under this assumption, the ratio of the number of census-PES matches (N11) to the correct census total (N1+), which is the proportion of people captured by the PES conditional on their capture in the census, should equal the ratio of the correct PES total (N+1) to the unknown true total population (N++), which is the unconditional proportion of people captured by the PES. This equality can be rearranged into Equation 4.1, and the resulting DSE total can then be used to calculate the net coverage rate, as in Equation 4.2:

$$\text{True (DSE) Total } (N_{++}) = \frac{N_{1+} \times N_{+1}}{N_{11}} \tag{4.1}$$

$$\text{Net Coverage Rate} = 100 \times \frac{\text{Census Total} - \text{DSE Total}}{\text{DSE Total}} \tag{4.2}$$
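A minimal numerical sketch of Equations 4.1 and 4.2 follows. The cell counts are hypothetical, the function names are illustrative, and for simplicity the sketch treats the correct census total (N1+) as the census total in Equation 4.2, which ignores erroneous enumerations and imputations.

```python
def dse_total(n11, n1_plus, n_plus1):
    """Equation 4.1: estimated true total N++ = N1+ * N+1 / N11."""
    if n11 <= 0:
        raise ValueError("the number of census-PES matches must be positive")
    return n1_plus * n_plus1 / n11

def net_coverage_rate(census_total, dse):
    """Equation 4.2: 100 * (Census Total - DSE Total) / DSE Total.

    Positive values are net overcounts; negative values are net undercounts.
    """
    return 100.0 * (census_total - dse) / dse

# Hypothetical counts for one estimation cell (illustrative only).
n11 = 900      # census-PES matches
n1_plus = 950  # correct census total
n_plus1 = 940  # correct PES total

estimated_total = dse_total(n11, n1_plus, n_plus1)
rate = net_coverage_rate(n1_plus, estimated_total)

print(f"DSE total: {estimated_total:.1f}")
print(f"Net coverage rate: {rate:.2f}%")
```

With these inputs the census captures about 95.7% of the estimated true total, so the net coverage rate is negative (a net undercount).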

To support this assumption of independence, the Census Bureau keeps census and PES processes operationally separate. In particular, the Master Address File is used as the frame for the census but not the PES, which instead fields a separate Independent Listing operation to construct its address list in sample blocks. Further, the Census Bureau makes extensive use of post-stratification to account for higher or lower rates of capture, in both the census and PES, by factors such as housing tenure (owner or renter) and the race and ethnicity characteristics of the residents. More recently, logistic regression has also been used to essentially condition the above DSE calculation on the values of these factor variables, resulting in more homogeneous subsets of the population with respect to their capture probabilities.
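The value of post-stratification can be seen in a toy example. The sketch below assumes two hypothetical strata (owners and renters) of 1,000 people each, with capture probabilities that differ by stratum but are independent between census and PES within each stratum. Counts are set to their expected values; the strata, probabilities, and function name are assumptions for illustration and do not reproduce the Census Bureau's actual post-strata or logistic regression models.

```python
def dse_total(n11, n1_plus, n_plus1):
    """Dual-system estimate of the true total: N1+ * N+1 / N11."""
    return n1_plus * n_plus1 / n11

# Expected counts per stratum, assuming independence within each stratum.
# Owners:  census capture 0.95, PES capture 0.90 -> matches 0.855 * 1000
# Renters: census capture 0.80, PES capture 0.70 -> matches 0.560 * 1000
strata = {
    "owners": {"n11": 855, "n1_plus": 950, "n_plus1": 900},
    "renters": {"n11": 560, "n1_plus": 800, "n_plus1": 700},
}

# Post-stratified estimate: apply the DSE within each stratum, then sum.
stratified_total = sum(dse_total(**cells) for cells in strata.values())

# Pooled estimate: ignore the strata and apply the DSE to combined counts.
pooled_cells = {
    key: sum(cells[key] for cells in strata.values())
    for key in ("n11", "n1_plus", "n_plus1")
}
pooled_total = dse_total(**pooled_cells)

print(f"Post-stratified DSE: {stratified_total:.1f}")
print(f"Pooled DSE:          {pooled_total:.1f}")
```

In this construction the true total is 2,000. The post-stratified estimate recovers it, while the pooled estimate falls short because people who are hard to capture in the census are also hard to capture in the PES, which violates independence in the pooled data.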

In addition to relying on the independence assumption, these equations do not account for various complications that stem from incomplete PES and census data, duplicate enumerations, and erroneous enumerations. The methodology also rests on additional assumptions, especially that the census and PES data can be matched with few errors. One aspect of incomplete data is that not all census enumerations can have their match status to the PES ascertained because they are not “data-defined,” meaning that they did not have sufficient information to be accepted for census processing and so had to be wholly imputed. To correct for duplicates, erroneous enumerations, and imputations, the Census Bureau estimates Coverage Correction Factors (CCFs) as a modification of the DSE estimators. The quality of the CCFs therefore depends on errors in the estimated numbers of duplicates and erroneous enumerations, errors in the treatment of incomplete data, and false non-matches and false matches; the frequency of such errors needs to be assessed to judge the success of the estimates. The resulting complexity of PES data collection and estimation is summarized in Box 4.2, which provides a simplified overview of the major steps.

From the 2020 PES, the Census Bureau released national estimates of net and gross coverage errors in March and May 2022 for people in households by age and sex groups, race and ethnicity, household relationship, housing tenure (i.e., own, rent), and geography (i.e., 4 regions, 50 states, and District of Columbia).6 The Census Bureau also provided estimates of components of error for various census operations. The 2020 PES estimates generally have 60–80% larger standard errors than in 2010 because the sample size was smaller to start with and the COVID-19 pandemic and other factors led to increased household nonresponse (see Section 4.3).

4.2 2020 COVERAGE RESULTS

4.2.1 Total Population

Table 4.2 provides the total population net undercount rates for the 2020 and 2010 Censuses for the low, middle, and high series of DA estimates. In each year, the census counts for the total population are on target with the DA middle

___________________

6 See the Census Bureau’s 2020 PES press kit at https://www.census.gov/newsroom/press-kits/2021/post-enumeration-survey.html. Unlike in 2010 (see Davis and Mulligan, 2012:Attachments 1, 2), the Census Bureau did not release coverage error estimates for large counties or places in 2020, determining that the uncertainty due to modeling error for substate estimates did not warrant their production. It did release national estimates of net and gross coverage errors by several characteristics for housing units, and also released household population and housing unit coverage estimates for Puerto Rico in August 2022, which are not further discussed.

Table 4.2 Percentage Net Coverage Error for the Total Resident Population: 2010 and 2020 Demographic Analysis and Post-Enumeration Survey Estimates

Year    Demographic Analysis                              Post-Enumeration Survey
        Low Series    Middle Series    High Series
2010    1.00          0.13             −1.27              0.01
2020    0.22          −0.35            −1.21              −0.24

NOTES: Positive numbers are net overcounts; negative numbers are net undercounts (the DA or PES estimates are the denominators); DA estimates are for the resident population; PES estimates are for the household population, excluding group quarters and remote Alaska. PES estimates are not statistically significantly different from zero at the 10% level in either the 2010 or 2020 Censuses.

SOURCE: U.S. Census Bureau (2022c).

series estimates within plus or minus 1%. The table also provides estimates of total population net undercount rates for 2020 and 2010 from the PES.

4.2.2 Age and Sex

Figure 4.1 compares 2020 net undercount estimates from the DA middle series for single years of age by sex. Age heaping—that is, net overcounts of ages ending in 0 and 5 and net undercounts of surrounding ages (see Chapter 3)—is evident for both men and women; generally, adult men have higher net undercount rates and lower net overcount rates than adult women—a pattern found in previous censuses. Figure 4.1 also shows substantial net undercounts of children and overcounts of older people, which characterized prior censuses as well. In addition, the 2020 Census substantially overcounted college-age people compared with a modest net overcount of this age group in 2010 (see Figure 3.1).

Figure 4.2 shows net coverage estimates from the 2010 and 2020 PES and the 2020 DA middle series by age and sex groups, with the DA estimates generally considered to be more accurate in the absence of contrary evidence. The three sets of estimates are similar except that:

  • The 2020 PES estimated a larger net undercount of young children (ages 0–4) than in 2010, and the 2020 DA estimated an even larger net undercount;
  • The 2020 DA estimated a larger net undercount for children ages 5–9 than either the 2010 or 2020 PES; and
Figure 4.1 Demographic Analysis estimates of percentage net undercount or overcount, by single years of age and sex, 2020, middle series.

SOURCE: Middle series from Table 1—Demographic Analysis Net Coverage Error Estimates by Single Year of Age, Sex, and Series: April 1, 2020 at https://www.census.gov/data/tables/2020/demo/popest/2020-demographic-analysis-tables.html.

  • The 2020 DA estimated a net overcount of women ages 18–29 and a negligible overcount of men ages 18–29, while the 2010 and 2020 PES generally estimated net undercounts for these groups. The discrepancy for people ages 18–29 likely arises because the PES, which excluded college/university student housing along with other group quarters, could not capture the significant problems in the 2020 Census enumeration of college housing (see Chapter 9).

4.2.3 Race, Ethnicity, Housing Tenure, Age and Sex Groups, States

The PES produces estimates for more attributes than does DA, including housing tenure, detailed race and ethnicity, and states, although it produces less detail for age and sex. By housing tenure,7 the PES estimated that the census had a net undercount of −1.48 percent for renters compared with a net overcount of 0.43 percent for owners in 2020—a similar pattern and magnitude to 2010.

___________________

7 The 2020 PES coverage estimates by housing tenure are statistically different from zero at the 10% level. The PES does not report coverage estimates by household relationship because that variable may be recorded differently between the census and the PES for reasons such as the household having a different reference person completing the two interviews. See Khubba et al. (2022:18).

Figure 4.2 Post-Enumeration Survey (2010, 2020) and Demographic Analysis (2020, middle series) estimates of percentage net undercount or overcount, by age and sex groups.

NOTES: DA, Demographic Analysis, covers the resident population; PES, Post-Enumeration Survey, covers the household population excluding group quarters and remote Alaska. All PES estimates are statistically significantly different from zero at the 10% level except estimates for people ages 5–9 (2010, 2020), 10–17 (2020), 18–29 female (2010), and 30–49 female (2020).
SOURCE: Khubba et al. (2022:Table 10).

Figure 4.3 shows estimates of net coverage error by race and ethnicity from the PES for 2010 and 2020. Net overcounts were higher in 2020 compared with 2010 for non-Hispanic White Alone people and substantially higher for Asian people. Net undercounts were higher in 2020 for Black people and American Indian or Alaska Native (AIAN) people on reservations and substantially higher for Hispanic people. Table 4.3 shows how the net differential undercount widened from 2010 to 2020 for Black, AIAN, and Hispanic people versus non-Hispanic White people.
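The differential net undercount in Table 4.3 is the net coverage rate for non-Hispanic White Alone people minus the rate for the specified group. The short sketch below reproduces that subtraction from the net coverage rates reported in Table 4.3; the dictionary layout, shortened group labels, and function name are illustrative choices.

```python
# Net coverage rates (percent) as reported in Table 4.3.
net_coverage = {
    2010: {
        "White Non-Hispanic Alone": 0.83,
        "Black AOIC": -2.06,
        "AIAN AOIC on Reservation": -4.88,
        "Hispanic": -1.54,
    },
    2020: {
        "White Non-Hispanic Alone": 1.64,
        "Black AOIC": -3.30,
        "AIAN AOIC on Reservation": -5.64,
        "Hispanic": -4.99,
    },
}

REFERENCE = "White Non-Hispanic Alone"

def differential_net_undercount(rates, reference=REFERENCE):
    """Reference group's net coverage rate minus each other group's rate."""
    ref_rate = rates[reference]
    return {
        group: round(ref_rate - rate, 2)
        for group, rate in rates.items()
        if group != reference
    }

differentials = {
    year: differential_net_undercount(rates) for year, rates in net_coverage.items()
}

for year, by_group in differentials.items():
    for group, diff in by_group.items():
        print(f"{year} {group}: {diff:.2f} percentage points")
```

The computed differences agree with the differential rows of Table 4.3 and show the gap widening for all three groups between 2010 and 2020.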

Figure 4.4 shows previously unpublished estimates of net coverage rates from the 2010 and 2020 PES for race and ethnic groups cross-classified by housing tenure (owner/renter) and age/sex groupings. The graphs are for non-Hispanic White Alone people (4.4(a)), Asian people (4.4(b)), Black people (4.4(c)), and Hispanic people (4.4(d)). AIAN people on reservations are not shown because of large standard errors.

Findings that stand out from these detailed cross-classifications include:

  • Non-Hispanic White Alone people were almost always overcounted in both censuses, although the percentages are not large except for men and
Figure 4.3 Post-Enumeration Survey (2010, 2020) estimates of percentage net undercount or overcount, by race and ethnicity.

NOTES: AIAN, American Indian or Alaska Native; AOIC, Alone or in Combination; PES, Post-Enumeration Survey. Excludes group quarters and Remote Alaska. PES estimates for Native Hawaiian or Other Pacific Islanders and AIAN AOIC in other American Indian areas and living elsewhere not shown because of large standard errors. Estimates for Some Other Race AOIC not shown because they are overwhelmingly Hispanic. All estimates are statistically significantly different from zero at the 10% level except the 2010 estimate for Asian Alone or in Combination (0% net undercount).

SOURCE: Khubba et al. (2022:Table 4).

    women ages 50 and over in rental housing, who were overcounted in 2020 by double the rate of 2010.
  • Asian people were predominantly overcounted in both censuses but especially so if they lived in rental housing in 2020.
  • Black people were generally undercounted in both censuses and particularly so if they lived in rental housing; net undercount rates were somewhat higher for this group in 2020 compared with 2010.
  • Hispanic people under age 50 were generally undercounted in both censuses with much higher rates in 2020 than in 2010 and if they lived in rental housing.

Finally, the 2020 PES estimated that 14 states had statistically significant estimates of net over- or undercount, compared with 0 states in 2010. The American Redistricting Project (2022) used the PES estimates for those 14 states to simulate how reapportionment would have turned out in 2020 if the census counts for those states were adjusted for net over- or undercount. It estimated that, compared with the actual reapportionment, California, Illinois, Michigan, New York, Ohio, Pennsylvania, and West Virginia would each still have lost

Table 4.3 Net Coverage Rates and Differential Net Undercount Rates Comparing White Non-Hispanic People to Specified Groups, 2010 and 2020 Post-Enumeration Survey

Year/Measure    White Non-Hispanic    Black Alone or      AIAN Alone or in Combination    Hispanic
                Alone                 in Combination      on Reservation
Net Coverage Rate
2010 PES        0.83                  −2.06               −4.88                           −1.54
2020 PES        1.64                  −3.30               −5.64                           −4.99
Differential Net Undercount (White Non-Hispanic Alone minus specified group)
2010 PES                              2.89                5.71                            2.37
2020 PES                              4.94                7.28                            6.63

NOTES: AIAN, American Indian or Alaska Native; PES, Post-Enumeration Survey. All net coverage rate estimates are statistically significantly different from zero at the 10% level.

SOURCE: Khubba et al. (2022:Table 4).

a seat; in addition, Minnesota and Rhode Island would each have lost a seat, and Colorado would not have gained a seat. Montana, North Carolina, and Oregon would each still have gained a seat, but Florida and Texas would each have gained three seats instead of one and two, respectively.

These results should be cited with caution. The method the Census Bureau used to estimate state coverage errors was a “synthetic” method—that is, the Census Bureau did not estimate coverage errors for each state directly but rather as outputs of a national-level logistic regression model that included state as one of the variables in the model (see Heim, 2022:Section 7.2, indicating that the inclusion of a state variable was an improvement over the 2010 PES). Without further evaluation, it is not possible to account for some of the apparently inconsistent results. For example, Florida and Texas both have large Hispanic populations (26% and 39%, respectively), which were severely undercounted nationally as well as in both states. California and New York also have large Hispanic populations (39% and 19%, respectively), yet California was estimated not to have a statistically significant net over- or undercount, and New York was estimated to have a substantial net overcount.

4.2.4 Post-Enumeration Survey—Components of Error

In addition to net coverage error, the PES is used to estimate what are termed gross coverage errors or “components of error.” Table 4.4 shows estimates of

Figure 4.4 Post-Enumeration Survey (2010, 2020) estimates of percentage net undercount or overcount for demographic groups.

components of coverage error by race and ethnicity from the PES for 2010 and 2020. On the census side for 2020 (columns 2, 4, 6, and 8), 94.4% of enumerations were correct, 1.6% were duplicates, 0.6% were other types of erroneous enumerations (e.g., counting a baby born after Census Day), and 3.4% were whole-person imputations. On the PES side for 2020 (columns 10 and 12), 94.2% of people in the P-sample matched a census enumeration in the E-sample, while 5.8% of people in the PES were missed in the census. The 2010 PES shows similar results, with a higher rate of duplicates (column 3 relative to column 4) and somewhat lower rates of whole-person imputations (column 7 vs. column 8) and omissions (column 11 vs. column 12).

Contributing to the net overcount of Asian people in 2020 (see Figure 4.3) was a lower rate of PES people omitted from the census for this group compared with 2010. Conversely, the net undercount rate for Black people increased in 2020 due to an increase in omitted people. The net undercount rate for Hispanic

NOTE: Estimates that are very close to zero have values that were set to ±0.25 because they were not statistically significant at the 10% level (the sign of the original estimate was kept).

SOURCES: 2020: Jost and Khubba (2022:Table D). 2010: Keller and Fox (2013:Table D).

Table 4.4 Percentages of the Household Population by Components of Coverage Error by Race and Ethnicity, 2010 and 2020 Census PES

Columns (1)–(8): Rows sum to 100% of enumerated household population in census (excluding group quarters and Remote Alaska)
Columns (9)–(12): Rows sum to 100% of household population estimated by PES/DSE

                            Census Correct  Census          Census Other    Census Whole-   PES-Matched     PES People
                            Enumerations    Duplicates      Erroneous       Person          Enumerations    Omitted from
                                                            Enumerations    Imputations                     Census
                            2010    2020    2010    2020    2010    2020    2010    2020    2010    2020    2010    2020
                            (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)     (10)    (11)    (12)
Total                       95.2    94.4    2.7     1.6     0.4     0.6     1.7     3.4     94.7    94.2    5.3     5.8
Race
White Non-Hispanic Alone    95.4    94.9    2.6     1.5     0.4     0.6     1.6     2.9     96.2    96.5    3.8     3.5
Black AOIC                  92.6    92.9    3.6     1.9     0.7     0.7     3.1     4.5     90.7    89.8    9.3     10.2
Asian AOIC                  94.7    94.0    2.4     1.8     0.9     0.7     2.1     3.5     94.7    96.5    5.3     3.5
AIAN AOIC on Reservations   90.8    91.7    4.7     4.6     0.4     0.5     4.1     3.2     86.3    86.5    13.7    13.5
Hispanic                    93.7    94.2    3.2     1.4     0.7     0.6     2.4     3.8     92.3    89.5    7.7     10.5

NOTES: AIAN, American Indian or Alaska Native; AOIC, Alone or in Combination; DSE, dual-system estimation; PES, Post-Enumeration Survey. Excludes group quarters and Remote Alaska. All estimates are statistically significantly different from zero at the 10% level.

SOURCES: 2010: Mule (2012:Table 9). 2020: Khubba et al. (2022:Table 5).

people increased markedly in 2020 due to a marked increase in omitted people, from 7.7% in 2010 to 10.5% in 2020.
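The changes in omission rates cited above can be read directly from columns 11 and 12 of Table 4.4. The sketch below computes the 2010 to 2020 change in the percentage of PES people omitted from the census for each group; the values come from Table 4.4, while the data layout and variable names are illustrative.

```python
# Percentage of PES people omitted from the census, from Table 4.4
# (column 11 = 2010, column 12 = 2020).
omission_rates = {
    "Total": (5.3, 5.8),
    "White Non-Hispanic Alone": (3.8, 3.5),
    "Black AOIC": (9.3, 10.2),
    "Asian AOIC": (5.3, 3.5),
    "AIAN AOIC on Reservations": (13.7, 13.5),
    "Hispanic": (7.7, 10.5),
}

# Change in percentage points from 2010 to 2020 (positive = more omissions).
omission_change = {
    group: round(rate_2020 - rate_2010, 1)
    for group, (rate_2010, rate_2020) in omission_rates.items()
}

# List groups from the largest increase to the largest decrease.
for group, change in sorted(omission_change.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{group}: {change:+.1f} percentage points")
```

The ordering shows the largest increase for Hispanic people and the largest decrease for Asian people, consistent with the direction of the net coverage changes in Figure 4.3.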

Table 4.5 shows estimates of components of coverage error for census operations; it repeats Table 3.4 and is reprinted here for ease of reference. Self-responses via paper, the internet, or telephone were clearly the highest-quality enumerations: they had the highest percentages of correct enumerations and among the lowest percentages of duplications, other erroneous enumerations, and whole-person imputations. This association applies, however, only to ID self-responses (i.e., those using the identification code corresponding to a particular address that was provided in the mailings asking households to respond). Non-ID self-responses (9.7% of the total) had a higher-than-average rate of whole-person imputations and a lower-than-average rate of correct census enumerations. Update Enumerate and Update Leave responses were about on par with Non-ID self-responses. Administrative records enumerations and Nonresponse Followup (NRFU) enumerations conducted with a household head fell between ID self-responses and Non-ID self-responses in quality. The worst-quality enumerations, with the lowest percentages of correct census enumerations and highest percentages of whole-person imputations, were those obtained in NRFU from either a proxy (a nonhousehold member such as a neighbor or landlord) or a household member who was not the head.

While comparisons of 2020 to 2010 would be informative, it is difficult to find categories for which comparable PES estimates were produced for both censuses. Reasonably comparable sets of estimates for 2020 and 2010 pertain to the time when NRFU enumerations were completed. In 2010, there is a definite degradation of quality from early NRFU enumerations to later enumerations. This pattern is not evident for 2020.8 In both censuses, NRFU enumerations with a household head (2010) or a household member (2020) were of higher quality than NRFU enumerations with a proxy such as a neighbor or landlord. In particular, NRFU household enumerations were correct 93% of the time in 2010 and 94% of the time in 2020, whereas NRFU proxy enumerations were correct only 70% of the time in 2010 and 87% of the time in 2020.9

Confirming that self-response is higher quality than other response modes, Figure 4.5 shows rates of census erroneous enumerations, whole-person imputations, and PES people omitted from the census for deciles of census tract self-response rates for 2020.10 The 2020 Census exhibits the same phenomenon observed in 2000: quality deteriorates the lower the self-response rate, particularly in terms of omissions of PES people from the census.11 This phenomenon means that population groups and areas that have lower self-response, which includes census tracts with predominantly Black or Hispanic residents, are at greater risk of net undercount compared with other groups (see Chapter 6).

___________________

8 For 2010, see Mule (2012:Table 19); for 2020, see Hill et al. (2022:Appendix Table 8).

9 For 2010, see Mule (2012:Table 21); for 2020, see Hill et al. (2022:Appendix Table 9).

10 This analysis can include PES people omitted from the census in addition to erroneous census enumerations and whole-person imputations because the variable being examined is the census tract self-response rate.

Table 4.5 Percentage of the Household Population by Components of Coverage Error by Census Operation, 2020 Census Post-Enumeration Survey

The four component columns (Correct, Duplicate, Other Erroneous, Whole-Person Imputations) add across to 100% of each operation.

| Coverage Error Component/Census Operation | Operation as % of Enumerations in Households | Correct Enumerations | Duplicate Enumerations | Other Erroneous Enumerations | Whole-Person Imputations |
|---|---|---|---|---|---|
| Census Total | 100.0 | 94.4 | 1.6 | 0.6 | 3.4 |
| Self-Response | | | | | |
| Internet and Telephone ID Response | 55.6 | 96.9 | 0.9 | 0.4 | 3.4 |
| Internet and Telephone Non-ID Response | 9.4 | 91.6 | 2.3 | 0.6 | 5.6 |
| Paper-based Response | 11.6 | 97.1 | 1.3 | 0.6 | 1.6 |
| Nonresponse Followup (NRFU) | | | | | |
| NRFU Head of Household | 9.7 | 94.0 | 3.3 | 0.8 | 1.8 |
| NRFU Other Household Member | 2.9 | 76.6 | 2.2 | 0.5 | 22.7 |
| NRFU Proxy | 4.7 | 87.4 | 3.7 | 1.8 | 7.0 |
| Administrative Records | 3.1 | 94.5 | 4.3 | 1.3 | |
| Update Leave/Update Enumerate | 2.0 | 91.7 | 4.0 | 0.6 | 3.6 |
| Count Imputations | 0.6 | | | | 100.0 |

NOTES: Table repeats Table 3.4 in its entirety and is reprinted here as aid to the reader. The estimates in this table are drawn from various tables in the source document; the categories are not necessarily exhaustive or mutually exclusive. In particular, the Update Leave/Update Enumerate category estimates represent those types of enumeration areas; the Update Leave responses consequently overlap with internet, paper, and NRFU responses. All estimates are statistically significantly different from zero at the 10% level. (Whole-person imputations are a census count and do not have associated sampling error.)

SOURCE: Hill et al. (2022:Appendix Tables 7, 9, 10).

Figure 4.5 Percentages of erroneous census enumerations, whole-person imputations, and Post-Enumeration Survey people omitted from the census for the household population by self-response rate deciles for census tracts, 2020 Post-Enumeration Survey.

NOTES: Each decile contains 10% of the nation’s census tracts, ordered by self-response rate from lowest (1st decile) to highest (10th decile). Erroneous enumerations combine duplicate and other erroneous census enumerations; Update Enumerate areas, by design, are enumerated in person and hence fall into the first (lowest) decile of census tract self-response rates. All estimates are statistically significantly different from zero at the 10% level. (Whole-person imputations are a census count and do not have associated sampling error.)

SOURCE: Hill et al. (2022:Table 6).
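As an illustration of the decile grouping behind Figure 4.5, the following Python sketch ranks tracts by self-response rate, assigns each tract to a decile, and pools an error rate within each decile. The tract values, the function names, and the simple rank-based cut points are assumptions made for illustration; they are not the Census Bureau's data or its estimation procedure, which uses weighted PES estimates.

```python
from collections import defaultdict


def assign_deciles(self_response_rates):
    """Assign each tract a decile from 1 (lowest self-response) to 10 (highest)."""
    n = len(self_response_rates)
    # Indices of tracts ordered from lowest to highest self-response rate.
    order = sorted(range(n), key=lambda i: self_response_rates[i])
    deciles = [0] * n
    for rank, tract_index in enumerate(order):
        deciles[tract_index] = rank * 10 // n + 1
    return deciles


def pooled_rate_by_decile(deciles, events, people):
    """Percentage of people with the event (e.g., omission), pooled within each decile."""
    numerators = defaultdict(float)
    denominators = defaultdict(float)
    for decile, event_count, person_count in zip(deciles, events, people):
        numerators[decile] += event_count
        denominators[decile] += person_count
    return {d: 100.0 * numerators[d] / denominators[d] for d in sorted(denominators)}


# Hypothetical tracts: self-response rate (%), total people, omitted people.
tract_rates = [35.0 + 3.0 * k for k in range(20)]
tract_people = [1000] * 20
tract_omissions = [100 - 4 * k for k in range(20)]  # made-up pattern for illustration

tract_deciles = assign_deciles(tract_rates)
omission_rates = pooled_rate_by_decile(tract_deciles, tract_omissions, tract_people)
```

With these made-up inputs, each decile holds two tracts and the pooled omission rate falls as the self-response decile rises, which is the qualitative pattern the figure describes.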

4.3 CAUTIONS AND AREAS FOR RESEARCH

4.3.1 Uncertainty in Demographic Analysis

Although traditionally viewed as close to a gold standard for estimating census coverage error, DA estimates are not error free. There is sampling error for components derived from the ACS (estimates of foreign-born immigrants and adjustment of Medicare enrollment for people ages 75 and over in the middle and high series); nonsampling error in birth records prior to 1980 because of registration incompleteness; and error in the estimation process. To endeavor to bound the error in DA estimates, beginning in 2010 the Census Bureau carried out a sensitivity analysis of the DA estimates by constructing lower and higher series around the middle series. In 2010, five series were originally produced, and the middle series was subsequently revised; in 2020, three series were produced. Each series varies on a few key parameters.

___________________

11 For 2020, see Jost and Khubba (2022:Table D); for 2000, see National Research Council (2004a:Table D.2). A comparable analysis was not produced from the 2010 PES.

Figure 4.6 Demographic Analysis estimates of percentage net undercount or overcount, by single years of age, 2020, low, middle, high series.

SOURCE: U.S. Census Bureau (2022d:Table 1).

For 2020, the components that varied among the low, middle, and high scenarios were births (from 288.4 million in the low series to 289.4 million in the high series); international migration (from 43.2 million to 45.6 million); and Medicare-based estimates (from 21.5 million to 23 million). Death estimates did not vary.12 Figure 4.6 shows net overcount and undercount estimates from the three series for 2020 by single years of age. The series vary by less than a percentage point through age 19, after which there is more divergence. Around age 75, the high series diverges substantially from the middle series (by about five percentage points on average), which, in turn, diverges from the low series (by about two percentage points on average). In fact, the high series at these older ages produces as many net undercounts as overcounts, whereas the middle and low series are always in net overcount territory. The Medicare component is responsible: the difference of about 1.5 million between the low and high series, which affects only people ages 75 and over, is almost the same as the difference in the DA estimates for this group.13

___________________

12 Jensen et al. (2020:Table 3) (see also Table 2, which describes all components of the DA estimates, and Table 8, which describes the assumptions used for the various components for the low, middle, and high series).

For 2020, for DA estimates for people ages 75 and older, the Medicare component equals actual Medicare enrollment in the low series. For the middle series, Medicare enrollment was adjusted upward using unenrolled estimates from 5-year ACS data for 2015–2019 (estimating that 1.7% of people ages 75 and over were not enrolled). For the high series, the middle series-adjusted Medicare estimates were inflated by an arbitrary 5% (estimating that 6.8% of people ages 75 and over were not enrolled). In 2010, Medicare data were used for the entire population ages 65 and older. The adjustment factors, taken from the Current Population Survey, were smaller than the ACS adjustment factors in 2020, and the differences between the high and low series affected ages 65–70 but not higher ages.14
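The relationship among the three Medicare-based figures can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that both adjustments are applied multiplicatively to the low-series enrollment total of 21.5 million; under that assumption the combined adjustment is about 6.8% and the high series comes to roughly 23 million, consistent with the figures quoted above. The Census Bureau's actual procedure, which operates by age and sex, may differ in detail.

```python
# Figures quoted in the text (millions of people ages 75 and over).
ENROLLED_LOW = 21.5          # low series: actual Medicare enrollment
ACS_UNENROLLED_ADJ = 0.017   # middle series: upward adjustment from ACS
HIGH_INFLATION = 0.05        # high series: additional 5% inflation

# Assumption: adjustments compound multiplicatively on enrollment.
middle_series = ENROLLED_LOW * (1 + ACS_UNENROLLED_ADJ)
high_series = middle_series * (1 + HIGH_INFLATION)

# Combined adjustment relative to enrollment, and the low-to-high gap.
combined_adjustment = (1 + ACS_UNENROLLED_ADJ) * (1 + HIGH_INFLATION) - 1
low_high_gap = high_series - ENROLLED_LOW

print(f"middle series: {middle_series:.2f} million")
print(f"high series:   {high_series:.2f} million")
print(f"combined adjustment: {100 * combined_adjustment:.1f}%")
print(f"low-to-high gap: {low_high_gap:.1f} million")
```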

Figure 4.7 shows net overcount and undercount estimates from the three DA series for 2020 for Hispanic and non-Hispanic people for four age groups: 0–4, 5–9, 10–17, and 18–29. The DA high and low series for Hispanic children ages 0–4 differ by 14 percentage points (from a slight net overcount to a 14% net undercount), compared to a much narrower range for non-Hispanic children ages 0–4 of 5 percentage points (from a net undercount of 7% to a net undercount of 2%). Similarly, the ranges for older Hispanic children and young adults are 14 percentage points (ages 5–9), 13 percentage points (ages 10–17), and 12 percentage points (ages 18–29), compared with 4 percentage points (ages 5–9 and 10–17) and 2 percentage points (ages 18–29) for older non-Hispanic children and young adults. The DA low series coverage estimates imply unrealistically that coverage for Hispanic children ages 0–4 was on target while young non-Hispanic children had a large net undercount. Conversely, the low series estimates imply that Hispanic young adults ages 18–29 had a large net overcount while coverage of non-Hispanic young adults was on target. These results conflict with what is known about coverage errors for Hispanic and non-Hispanic people from the PES (see Figure 4.4, particularly panel (d)).

Examining the assumptions built into the 2020 high and low DA series for Hispanic children and young adults, it appears that differences in how Hispanic births were determined are a primary reason for the differences in estimates, especially at younger ages. For the low series, Hispanic births were assigned by the ethnicity of the mother, while for the high series, babies were classified as Hispanic if either parent was reported to be Hispanic, resulting in a larger number of Hispanic births (see Jensen et al., 2020:Tables 5–8).15 The results are the opposite for non-Hispanic births because they were calculated as a residual. Consequently, the low DA series shows more complete coverage (and even overcounts) of Hispanic compared with non-Hispanic children and young adults and vice versa for the high series. Further research on these large differences is clearly needed.

There are also differences in estimates of net international migration comparing the low to high DA series for Hispanic children and young adults, although they are much smaller than the differences in estimates of births (see Jensen et al., 2020:Table 5). Although Jensen et al. (2020) do not break out the contributions of births and net international migration for Hispanic people by age, it is probable that assumptions about net international migration play a bigger role in differences between the low and high DA series for older age groups. The Census Bureau plans to release an experimental set of DA estimates for Hispanic people ages 30–39, and it will be important to assess the role of various assumptions for net international migration in these estimates and to conduct research to try to narrow the range of estimates.

___________________

13 Compare Jensen et al. (2020:Table 3) with U.S. Census Bureau (2022d:Table 1).

14 Devine et al. (2012).

15 The Census Bureau used a process that links census records for children and parents (KidLink) to try to assign race and ethnicity as they would be reported in a census or survey, instead of being assigned by other people (e.g., hospital personnel) as on birth records, to classify births for the middle DA series (see Jensen and Kennel, 2022:4–6).

Figure 4.7 Demographic Analysis estimates of percentage net undercount or overcount, by ethnicity, children and young adults, 2020, low, middle, high series.

SOURCE: Calculations by the panel from tables available at data.census.gov for the 2020 Census: Table PCT12 (total), PCT12H (Hispanic); and in U.S. Census Bureau (2022d:Table 3 (total), 1C (Hispanic)) for the DA low, middle, and high series. Note that the non-Hispanic estimates are the result of subtraction of the Hispanic estimates from the total estimates.
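The mirror-image behavior of the Hispanic and non-Hispanic DA estimates follows mechanically from computing the non-Hispanic group as a residual. The sketch below uses entirely hypothetical counts and assumes that net coverage error is expressed as the census count minus the DA estimate, as a percentage of the DA estimate (negative values indicate a net undercount). Moving births from the residual group into the Hispanic group lowers measured Hispanic coverage and raises measured non-Hispanic coverage by construction.

```python
def net_coverage_error(census_count, da_estimate):
    """Percent net coverage error; negative means net undercount (assumed convention)."""
    return 100.0 * (census_count - da_estimate) / da_estimate


def hispanic_and_residual(census_total, census_hispanic, da_total, da_hispanic):
    """Coverage for Hispanic people and for non-Hispanic people derived by subtraction."""
    census_non_hispanic = census_total - census_hispanic
    da_non_hispanic = da_total - da_hispanic  # residual, not independently estimated
    return (
        net_coverage_error(census_hispanic, da_hispanic),
        net_coverage_error(census_non_hispanic, da_non_hispanic),
    )


# Hypothetical counts (thousands) for one age group; not actual census or DA figures.
CENSUS_TOTAL, CENSUS_HISPANIC, DA_TOTAL = 1000, 250, 1040

# Low-series style assumption: fewer births classified as Hispanic.
low_hispanic, low_non_hispanic = hispanic_and_residual(
    CENSUS_TOTAL, CENSUS_HISPANIC, DA_TOTAL, da_hispanic=250
)
# High-series style assumption: more births classified as Hispanic.
high_hispanic, high_non_hispanic = hispanic_and_residual(
    CENSUS_TOTAL, CENSUS_HISPANIC, DA_TOTAL, da_hispanic=290
)
```

In this made-up example the low-series assumption puts Hispanic coverage on target and non-Hispanic coverage in net undercount, while the high-series assumption reverses the picture, even though the census counts and the DA total are unchanged.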

4.3.2 Delays in Demographic Analysis Net Undercount Estimates by Race

No comparisons of DA estimates for Black and non-Black people to the 2020 Census results have been released to date. To permit race comparisons, census race data must be modified—specifically, people who checked “Some Other Race” (SOR) and neither checked another category nor wrote in any specific race must have a specific race imputed for them because the DA estimates derive from vital records and other sources that historically have not always had an SOR category. The Census Bureau therefore creates a “Modified Race File” to support DA as well as its annual population estimates for the nation, states, and counties. Compared with 2010, the Census Bureau experienced delays in creating a 2020 Modified Race File because it had not developed appropriate confidentiality-protection routines consistent with its new approach to information disclosure avoidance.16 The Census Bureau hoped to release the 2020 file in summer 202317 but, as of this writing, has not produced a file with acceptable accuracy18 (see Chapter 10 for more information on the Modified Race File and Chapter 11 for information on confidentiality protection in the 2020 Census).

For the Modified Race File, the Census Bureau distributes the SOR Alone category of responses to specific race groups by age and sex. The SOR Alone reassignment method essentially searches for a donor—someone in the household or neighboring area who reported a specific race(s) and Hispanic origin (yes/no) is chosen to represent the person who checked SOR, and the race(s) of the donor are assigned to that person.19 For people classified as SOR and one or more specific races (i.e., Two or More Races), the Census Bureau assigns a specific race category and removes the person from the Two or More Races category.20
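The donor search described above can be sketched as a simple hot-deck style routine. Everything in this example is hypothetical: the record layout, the function names, the rule that a donor must share the recipient's Hispanic origin, and the order of search (household first, then neighbors) are assumptions for illustration, not the Census Bureau's production method.

```python
def is_specific_race_report(person):
    """True if the person reported at least one race and none of them is SOR."""
    races = person["races"]
    return bool(races) and "SOR" not in races


def find_donor(person, household, neighbors):
    """Look for a donor in the household first, then in the neighboring area."""
    for pool in (household, neighbors):
        for candidate in pool:
            if candidate is person:
                continue
            # Assumption: the donor must match on Hispanic origin (yes/no).
            if is_specific_race_report(candidate) and candidate["hispanic"] == person["hispanic"]:
                return candidate
    return None


def reassign_sor_alone(person, household, neighbors):
    """Return the race(s) to use for this person in a modified race file."""
    if person["races"] != ["SOR"]:
        return list(person["races"])  # only SOR Alone responses are reassigned here
    donor = find_donor(person, household, neighbors)
    return list(donor["races"]) if donor is not None else list(person["races"])


# Hypothetical records.
respondent = {"id": 1, "races": ["SOR"], "hispanic": True}
housemate = {"id": 2, "races": ["White"], "hispanic": True}
neighbor = {"id": 3, "races": ["Black"], "hispanic": True}

from_household = reassign_sor_alone(respondent, [respondent, housemate], [neighbor])
from_neighbor = reassign_sor_alone(respondent, [respondent], [neighbor])
no_donor = reassign_sor_alone(respondent, [respondent], [])
```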

___________________

16 See Modified Race Data 2010 at https://www.census.gov/data/datasets/2010/demo/popest/modified-race-data-2010.html, released July 2012.

17 Private communication from Eric Jensen, U.S. Census Bureau, April 7, 2023. The Census Bureau did not, in any case, want to release the 2020 Modified Race File prior to the May 2023 release of the Demographic and Housing Characteristics file, which contains detailed age, sex, race (including SOR), and ethnicity data from the 2020 Census (see Chapter 11).

18 See slide 23 in “Briefing on the Base Evaluation and Research Team,” available at https://www2.census.gov/about/partners/cac/sac/meetings/2023-09/presentation-briefing-base-evaluation-research-term.pdf.

19 See U.S. Census Bureau, Population Division (2012) for the 2010 methodology, which is being used for 2020.

20 In 2020, reassigning people categorized as SOR and a specific race to the specific race will likely create a larger White population, given the new write-in space for the White checkbox and the coding that assigned people who only checked White but wrote in a Hispanic ethnicity (or other specific race) to White and SOR (or White and other specific race) (see Chapter 10).

4.3.3 Lack of Subnational Demographic Analysis Estimates

Historically, Demographic Analysis has been unable to provide subnational estimates because of the difficulties of estimating not only net international migration but also internal migration. The Census Bureau plans to produce subnational coverage estimates for children ages 0–4 on an experimental basis, as that is the age group for which estimates are dominated by births and migration should be least problematic. It could be worthwhile to develop subnational estimates for children ages 5–9 as well. The need for subnational estimates is clear from the 2020 PES results, which estimated higher net overcounts and net undercounts than in 2010 for various population groups, which are not distributed equally across the nation.

4.3.4 Uncertainty in Post-Enumeration Survey Results

The previously described complexity of PES data collection and estimation means that there are multiple sources of error in the estimates of each component, such as matched E-sample and P-sample responses, and even more so in estimates of net coverage. Sources of error include:

  • Sampling error from sampling census blocks (and subsampling housing units in large blocks) for the P-sample;
  • Nonresponse error to the extent that nonrespondents to the P-sample differed from respondents;
  • Coverage error in the P-sample, which could miss people and duplicate or otherwise erroneously include people;
  • Imputation error from imputing values for missing characteristics for P-sample respondents (imputation error also affects the E-sample);
  • Recall error by P-sample respondents as to who lived in the household on Census Day and where they themselves were living on Census Day, April 1 (household members at the time of the P-sample interview could have lived at the P-sample household on Census Day or have moved into the P-sample household from some other household, and some household members on Census Day could have moved out before the P-sample interview);
  • Error in estimating match status and enumeration status, including for “unresolved” cases, when even follow-up in the field could not obtain sufficient information to permit an unambiguous classification; and
  • Failure of the independence assumption.

Marra and Kennel (2022) provide a comprehensive discussion of potential errors in the 2020 PES but present a method to quantify only the error due to sampling. This means that net coverage rates for, say, race and ethnic groups that are deemed to be statistically different from zero at the 90% confidence level may in fact not be distinguishable from zero were other sources of error taken into account. To date, the Census Bureau has not conducted sensitivity analyses for the PES to attempt to bound errors from the matching process or other sources of error, and no such analyses were conducted for the 2010 PES. The 2000 PES program did include sensitivity analyses, which helped identify components of the data-collection and estimation process that, if somewhat different assumptions were made, could have had significant effects on estimates of net undercount rates. The 2000 PES also included evaluations of several processes (e.g., matching) that included field and clerical work.21

4.3.5 Comparison of Errors in the 2010 and 2020 PES

The 2020 PES was conducted under difficult circumstances. The COVID-19 pandemic delayed NRFU in the census and therefore delayed fielding the P-sample interview (the PES independent address listing was completed before the nationwide pandemic-related shutdown in March 2020). The 2020 PES also experienced high levels of nonresponse, likely due to a combination of factors such as fear of contact during the shutdown and heightened distrust of the government. In fact, the Census Bureau extended the P-sample interviewing for an additional period to bring up response rates.

Table 4.6 compares the 2020 and 2010 PES field schedules for the P-sample, while Table 4.7 compares features of the 2010 and 2020 PES that could introduce error. While not explicitly called out in Table 4.7, the delayed schedule for the P-sample interviewing in 2020 (Table 4.6) likely impaired the accuracy of responses about Census Day residence because of the longer recall period. Overall, it is evident that the quality of the 2020 PES was less than that of the 2010 PES in many respects, underscoring the importance of additional analysis of 2020 PES quality, including sensitivity analysis.
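The rates reported in Table 4.7 follow directly from the counts in the same table. As a check, the sketch below recomputes the noninterview rate (row 5), the P-sample insufficient information rate (row 10), and the E-sample insufficient information rate (row 13) from the rounded counts; the variable and function names are illustrative.

```python
# Rounded counts from Table 4.7 (rows 2, 3, 8, 9, 11, 12).
pes_counts = {
    "2010": {
        "interviews": 140_000, "noninterviews": 5_300,
        "p_people": 383_000, "p_insufficient": 6_400,
        "e_people": 384_000, "e_insufficient": 13_000,
    },
    "2020": {
        "interviews": 114_000, "noninterviews": 23_000,
        "p_people": 301_000, "p_insufficient": 12_500,
        "e_people": 397_000, "e_insufficient": 40_000,
    },
}


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the table."""
    return round(100.0 * numerator / denominator, 1)


def pes_rates(counts):
    return {
        # Row 5: row 3 / (row 2 + row 3)
        "noninterview_rate": percent(
            counts["noninterviews"], counts["interviews"] + counts["noninterviews"]
        ),
        # Row 10: row 9 / row 8
        "p_insufficient_rate": percent(counts["p_insufficient"], counts["p_people"]),
        # Row 13: row 12 / row 11
        "e_insufficient_rate": percent(counts["e_insufficient"], counts["e_people"]),
    }


rates = {year: pes_rates(counts) for year, counts in pes_counts.items()}
for year, values in rates.items():
    print(year, values)
```

The recomputed values match the table: the noninterview rate rises from 3.6% to 16.8%, and both insufficient information rates are roughly two to three times their 2010 levels.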

4.3.6 Limited Analysis of Components of Coverage Error

A limitation of the 2020 PES program thus far is that the assessments of components of error for census operations are not always comparable with the assessments of 2010, even when comparability could be achieved (see preceding section). It would be useful, for example, to estimate, for 2010, the components of error for NRFU interviews with a household member other than the head (as in 2020) and for 2020 to estimate the components of error separately for Update/Enumerate and Update/Leave operations (as in 2010).22 Further, for 2020, it would be useful to produce estimates of components of error for exhaustive, mutually exclusive categories of enumerations (e.g., self-responses versus NRFU responses in Update Leave areas).

___________________

21 See National Research Council (2004a:Ch. 6). The original 2010 PES coverage estimation results, which estimated a net undercount, were inconsistent with the DA results, which estimated a slight net overcount. This finding and concerns about some aspects of the 2010 PES led to additional evaluations involving original data collection and analysis. The results of this work demonstrated that the 2010 PES classified many duplicate enumerations in the E-sample as correct and missed in the P-sample. Reanalysis led to a lower census correct enumeration rate, a higher match rate, and a lower estimate of net undercount. See also Mulry and Spencer (1991) for the development of a total error model for the 1990 Census based on the 1988 test census.

Table 4.6 Schedule of Person Interviewing Field Operations, 2010 and 2020 Post-Enumeration Surveys

| Milestone | 2010 PES | 2020 PES |
|---|---|---|
| P-Sample Person Interviewing | 8/15/10–10/15/10 | 9/23/20–12/18/20 |
| P-Sample Second Round of Person Interviewing (to boost response rates) | Not needed | 2/11/21–3/20/21 |
| Field Follow-up of Unresolved Person Cases | 1/2011–3/2011 | 6/14/21–9/17/21 |

NOTES: PES, Post-Enumeration Survey.

SOURCE: 2010: Whitford (2008). 2020: Marra and Kennel (2022:Table 1).

It would also be useful to conduct analyses like the one included in National Research Council (2004a:Table D.1). That analysis examined not only duplicates and other erroneous enumerations in the 2000 E-sample, but also omissions of P-sample people from E-sample households, for E-sample households classified by owner/renter and mail/enumerator return status. This type of analysis is limited to households with at least one member who matches between the E-sample and P-sample but could be illuminating nonetheless, particularly if conducted for additional types of census operations (e.g., proxy or administrative records NRFU responses). Given that increases in omissions appeared to drive higher net undercount rates in 2020 for groups such as Hispanic people, an analysis of E-sample and P-sample households with at least one common member seems imperative to understand sources of and possible ways to reduce undercount in 2030.

Additionally, E-sample and P-sample households could be partitioned into two groups: those addresses, households, or individuals for which attempts at enumeration resulted in a census component error (e.g., an omission) and those that did not. Analyzing the two groups using discriminant analysis or classification trees with demographic, neighborhood, and other variables as predictors could help assess the quality of each census process in enumerating the population. In turn, the results could provide a basis for considering specific processes for future improvements. See Appendix D.1 for an outline of such an approach.

___________________

22 In the 2010 Census, the Census Bureau used the slash (/) character in characterizing these two operations, to distinguish the address verification component of the operation (“Update”) from the questionnaire/interview component (“Leave” or “Enumerate”), but no slash is used in the 2020 Census operation names.

Table 4.7 2010 and 2020 Post-Enumeration Survey Features Relevant for Error

| Row | Feature | 2010 PES | 2020 PES |
|---|---|---|---|
| 1 | P-Sample Size (addresses) | 171,000 | 161,000 |
| 2 | Interviews | 140,000 | 114,000 |
| 3 | Noninterviews | 5,300 | 23,000 |
| 4 | Vacant or Nonexistent | 26,800 | 24,200 |
| 5 | Noninterview Rate (row 3/(row 2 + row 3)) | 3.6% | 16.8% |
| 6 | P-Sample Size (listed people) | 393,000 | 345,000 |
| 7 | P-Sample People with Insufficient Information for DSE | 13,000 | 51,000 |
| 8 | P-Sample People after Whole Households with Insufficient Information Treated as Noninterviews | 383,000 | 301,000 |
| 9 | Remaining P-Sample People with Insufficient Information (requires imputation of P-sample inclusion and match statuses) | 6,400 | 12,500 |
| 10 | P-Sample Insufficient Information Rate (row 9/row 8) | 1.7% | 4.2% |
| 11 | E-Sample People | 384,000 | 397,000 |
| 12 | E-Sample People with Insufficient Information (requires imputation of enumeration and match statuses) | 13,000 | 40,000 |
| 13 | E-Sample Insufficient Information Rate (row 12/row 11) | 3.4% | 10.1% |
| 14 | P-Sample People with at Least One Characteristic Imputed (%) | 6.6% | 15.7% |
| 15 | E-Sample People with at Least One Characteristic Imputed (%) | 15.0% | 16.8% |
| 16 | Unresolved Match Status Rate (P-Sample) | 1.9% | 5.6% |
| 17 | Unresolved Enumeration Status Rate (E-Sample) | 4.8% | 11.6% |

NOTES: PES, Post-Enumeration Survey; DSE, dual-system estimation.

SOURCE: Beaghen et al. (2022:Tables 1, 2, 3, 8, 11); Phan and Lawrence (2022:Table 7).

Having made this suggestion, it must be pointed out that such regression analyses or discriminant analyses, and many of the analyses in this report, compare groups of households that potentially differ for reasons other than their mode of census response or degree of PES error. Basing decisions on such analyses risks treating correlations and related statistics as causal when they are only associative.

4.3.7 Limited Granularity of Coverage Estimates

Finally, the inability to provide granular geographic or population group coverage estimates is a drawback of the PES (although the PES provides considerably more information than DA in this regard). At the subnational level, direct coverage estimates from the PES have high variability—even if they are unbiased, the variability is too great for usable assessments of population group coverage by state or coverage for substate areas, for example. Careful regression or machine learning models, however, could produce focused coverage estimates with sufficiently low variance and bias (i.e., low mean square error) to support such assessments. For example, estimated coverage for Hispanic people could combine information to adjust for urban/rural, region, and other relevant factors. Models would need to be weighted by estimated variances and possibly by survey weights. The proposed approach would be similar to that used for decades in the Census Bureau’s Small Area Income and Poverty Estimates program.
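One simple form that such a model-based estimate can take is a variance-weighted composite of the direct PES estimate and a regression prediction, in the spirit of area-level shrinkage models such as Fay-Herriot. The sketch below is illustrative only: the function name, the input values, and the assumption that both variances are known are not from the source, and a production model would also need to estimate the regression and the model variance.

```python
def composite_estimate(direct, direct_variance, model_prediction, model_variance):
    """Shrink a noisy direct estimate toward a model prediction.

    The weight on the direct estimate is model_variance / (model_variance + direct_variance),
    so a direct estimate with large sampling variance receives little weight.
    """
    weight_on_direct = model_variance / (model_variance + direct_variance)
    estimate = weight_on_direct * direct + (1.0 - weight_on_direct) * model_prediction
    return estimate, weight_on_direct


# Hypothetical state-level net coverage error (%) for one population group.
estimate, weight = composite_estimate(
    direct=-6.0,            # direct PES estimate, highly variable at the state level
    direct_variance=9.0,    # sampling variance of the direct estimate
    model_prediction=-3.0,  # prediction from region, urban/rural, and other covariates
    model_variance=3.0,     # variance of the model error
)
print(f"weight on direct estimate: {weight:.2f}, composite estimate: {estimate:.2f}%")
```

The design choice here is the usual bias-variance trade: the composite accepts some model bias in exchange for a large reduction in variance where direct samples are small.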

Another approach that could shed light on reasons why some groups are missed in the census more than others and other groups double counted is to conduct a large matching study for 2020 Census enumerations in a sample of census tracts using data from the ACS for 2019–2021 and administrative records from a wide variety of sources, not just those used for enumeration in 2020. The goal would not be to generate another set of coverage estimates for 2020 but to learn as much as possible about which groups of people show up in which kinds of data sources to generate testable hypotheses for methods to improve coverage in 2030, particularly for undercounted race and ethnic groups. The study could also examine the suitability of administrative records to improve the count of certain populations, such as young children and Medicare-age adults.

The advantage of using the ACS for matching is that it provides a much larger sample than the 2020 PES; a combined sample for 2019–2021 would total over 5 million households. It also includes group quarters, which the PES did not. ACS interviews farther away from Census Day would be harder to match because people may have moved or changed household composition in the interim. For analysis purposes, ACS interviews could perhaps be inversely weighted according to their distance in time from the census.
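One possible form for such inverse time weighting is sketched below. The functional form, the six-month scale parameter, and the function names are assumptions chosen for illustration; the source does not specify a weighting scheme.

```python
from datetime import date

CENSUS_DAY = date(2020, 4, 1)


def months_from_census_day(interview_date):
    """Whole calendar months between the interview month and April 2020."""
    return abs(
        (interview_date.year - CENSUS_DAY.year) * 12
        + (interview_date.month - CENSUS_DAY.month)
    )


def time_distance_weight(interview_date, half_weight_months=6.0):
    """Weight of 1.0 in the census month, falling to 0.5 at `half_weight_months` away."""
    months = months_from_census_day(interview_date)
    return 1.0 / (1.0 + months / half_weight_months)


# Hypothetical ACS interview dates across the 2019-2021 window.
for interview in (date(2019, 10, 1), date(2020, 4, 15), date(2020, 10, 1), date(2021, 12, 1)):
    print(interview.isoformat(), round(time_distance_weight(interview), 3))
```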

To home in on duplicate enumerations in the census involving more than one address, the Census Bureau could match the set of 2020 Census records against itself. This would identify situations such as those involving people who move seasonally between homes in colder and warmer parts of the United States (“snowbird” and “sunbird” populations), other people with two or more homes, college students enumerated both at college and at home, children enumerated by both parents in joint custody arrangements, and households in apartment buildings enumerated more than once due to apartment number mix-ups. The goal would be to provide an evidence base for methods to reduce duplications involving more than one address in the 2030 Census. An added goal could be to develop a set of well-tested decision rules that would allow the Census Bureau to remove such duplicates, just as it has, for several censuses, used an algorithm to drop duplicate enumerations within a household.
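A self-match of this kind can be sketched as grouping records on a person-level key and flagging keys that appear at more than one address. The record layout, the exact-match key (normalized name plus date of birth), and the function name are hypothetical simplifications; real record linkage would need probabilistic matching to handle misspellings, missing birth dates, and common names.

```python
from collections import defaultdict


def find_cross_address_duplicates(records):
    """Return person keys that appear at two or more distinct addresses."""
    addresses_by_person = defaultdict(set)
    for record in records:
        # Assumption: exact match on normalized name and date of birth.
        key = (record["name"].strip().lower(), record["dob"])
        addresses_by_person[key].add(record["address_id"])
    return {
        key: sorted(addresses)
        for key, addresses in addresses_by_person.items()
        if len(addresses) > 1
    }


# Hypothetical census records.
census_records = [
    {"name": "Pat Lee", "dob": "2001-05-02", "address_id": "college-dorm-17"},
    {"name": "pat lee ", "dob": "2001-05-02", "address_id": "parent-home-03"},
    {"name": "Sam Roe", "dob": "1950-01-20", "address_id": "home-88"},
    # Same person at the same address: a within-household duplicate, not flagged here.
    {"name": "Sam Roe", "dob": "1950-01-20", "address_id": "home-88"},
]

cross_address_duplicates = find_cross_address_duplicates(census_records)
```

Here only the student counted both at college and at the parental home is flagged; the within-household repeat is left to the existing within-household algorithm the text mentions.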

4.4 CONCLUSIONS AND RECOMMENDATIONS

Conclusion 4.1: Comparing the 2010 and 2020 Census net population coverage estimates from Demographic Analysis (DA):

  • For total population, both censuses, based on the middle series of DA estimates, closely approximated the estimated population total (census erroneous enumerations almost offset census omissions).
  • For age and sex, the 2020 Census undercounted children ages 0–4 more than the 2010 Census. The 2020 Census also overcounted college-age people substantially more than the 2010 Census and, similar to the 2010 Census, overcounted people ages 50 and older.
  • To date, no estimates of net coverage error for the 2020 Census by race are available because production of the necessary census race data was delayed by development of confidentiality-protection routines.

Conclusion 4.2: Comparing the 2010 and 2020 Census population coverage estimates from the Post-Enumeration Survey (PES):

  • Differences among race and ethnic groups widened substantially in 2020, with adverse implications for use of the data to allocate fixed resources, such as representation, funding, and services. Some groups saw increases in net overcounts, particularly non-Hispanic White Alone people and Asian people, and other groups saw increases in net undercounts, particularly Black people, Hispanic people, and American Indians and Alaska Natives on reservations.
  • Increases in net undercounts in 2020 for groups such as Black and Hispanic people were driven by increases in omissions from the census (as measured by a lower match rate for people in the P-sample to the E-sample in the PES). Omissions, in turn, were higher in census tracts with lower rates of self-response.
  • Net overcounts and undercounts were more pronounced for people of various groups in rental compared with owner housing, particularly in 2020 compared with 2010.

Conclusion 4.3: With respect to the quality of the 2020 Post-Enumeration Survey (PES) and Demographic Analysis (DA) estimates:

  • The quality of the 2020 PES deteriorated compared with 2010, due to smaller sample sizes, higher nonresponse rates, higher missing data rates, higher rates of unresolved match status and unresolved enumeration status, and delays caused by the COVID-19 pandemic.
  • While DA continued to set a standard for estimation of the total population in 2020, it showed a wide range of net coverage estimates, based on Medicare records and assumptions about completeness of coverage, for people ages 75 and older among the low, middle, and high series of estimates. The 2020 DA coverage estimates series for Hispanic and non-Hispanic children and young adults also showed a wide range, largely due to differences in methods for assigning Hispanic births. Differences in net international migration assumptions also likely played a role, particularly for older ages.

Recommendation 4.1: For the Demographic Analysis (DA) coverage evaluation method, the U.S. Census Bureau should conduct research on:

  • The reasonableness of the assumption for incompleteness of Medicare enrollment in the high series, which produced coverage estimates that diverged greatly from the low and middle series for people ages 75 and older, through an appropriate match study;
  • The suitability of Medicare data for coverage evaluation of the entire population ages 65 and older as in the 2010 Census and for improving the census count of this age group in 2030;
  • The reasonableness of the methods for assigning Hispanic and non-Hispanic births for coverage estimates for children and young adults, which produced wide differences between the low and high DA series in 2020;
  • Methods for narrowing estimates of net international migration, which affect estimates for Hispanic people and the total population; and
  • Methods for developing experimental subnational DA estimates, starting with young children.

Recommendation 4.2: The U.S. Census Bureau should conduct additional analyses of the 2020 Post-Enumeration Survey (PES) to learn as much as possible about census omissions. One such analysis should include P-sample and E-sample households with at least one member in common, to assess not only erroneous census enumerations but also omissions and how they vary with household characteristics and type of enumeration (e.g., proxy interview, internet response). In addition, the Census Bureau should perform discriminant or similar analyses to identify variables that contributed to census component errors (e.g., omission, duplication) for 2020 P-sample and E-sample addresses, households, and individuals, which could identify census processes to target for improvement in 2030.

Recommendation 4.3: For the Post-Enumeration Survey (PES) method, the U.S. Census Bureau should:

  • Perform sensitivity analyses, based on plausible assumptions, to put error bounds around such key operations as matching and imputation of match and enumeration status, to evaluate the quality of the 2020 PES and plan such analyses from the outset for the 2030 PES;
  • Plan analyses for the 2030 PES of components of error for census operations in exhaustive and mutually exclusive categories that are as comparable as possible between 2020 and 2030;
  • Seek adequate funding to increase the PES sample size in 2030 to at least 2010 levels; and
  • Experiment with modeling to estimate net undercoverage and overcoverage for more detailed geographic and demographic groups for which direct estimates could not be provided in 2020.
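
As an illustration of the kind of sensitivity analysis described in the first bullet, the sketch below bounds a P-sample match rate by treating every unresolved case first as a nonmatch and then as a match. It is a minimal sketch under stated assumptions, not the Census Bureau's procedure: the function name `match_rate_bounds` and the counts are hypothetical, and survey weights and imputation models are ignored.

```python
def match_rate_bounds(matched, nonmatched, unresolved):
    """Return (lower, upper) bounds on the match rate.

    Hypothetical helper: the lower bound treats all unresolved cases as
    nonmatches; the upper bound treats all of them as matches.
    Counts are unweighted for simplicity (an assumption of this sketch).
    """
    total = matched + nonmatched + unresolved
    if total <= 0:
        raise ValueError("total P-sample count must be positive")
    lower = matched / total
    upper = (matched + unresolved) / total
    return lower, upper


# Made-up counts for illustration only; these are not PES results.
low, high = match_rate_bounds(matched=900, nonmatched=60, unresolved=40)
print(f"match rate between {low:.3f} and {high:.3f}")
```

The wider the gap between the two bounds, the more the resulting omission estimates depend on how unresolved match status is imputed.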

Recommendation 4.4: The U.S. Census Bureau should prioritize research on potential sources of coverage errors, both undercounts and overcounts, for geographic areas and population groups, using additional methods besides further analysis of the Post-Enumeration Survey. To address omissions from the census, the Census Bureau should match 2020 Census records with the 2019–2021 American Community Survey and a wide variety of administrative records for census tracts—perhaps sampling those with low self-response rates. The goal would be to provide an evidence base for methods and data sources that could potentially reduce omissions in the 2030 Census. To address duplications in the census, the Census Bureau should match 2020 Census records with themselves to identify duplicate enumerations of people with more than one residence, college students enumerated at college and at home, children in joint custody arrangements, and the like. The goal would be to provide an evidence base for methods to reduce duplicates in the 2030 Census and potentially to remove them from the count, just as people duplicated within a household are dropped from the count.
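
The idea of matching census records with themselves can be sketched as a simple self-join on a person key. This is a toy illustration, not the Census Bureau's matching system: the record fields (`first`, `last`, `dob`, `address_id`), the function name, and the sample records are all hypothetical, and exact matching on a normalized name and date of birth stands in for the probabilistic record linkage a production study would likely need.

```python
from collections import defaultdict


def find_cross_address_duplicates(records):
    """Return person keys enumerated at more than one address.

    Hypothetical helper: groups records by a normalized
    (first name, last name, date of birth) key and keeps only keys
    whose records span two or more distinct address IDs.
    """
    groups = defaultdict(set)
    for rec in records:
        key = (rec["first"].strip().lower(), rec["last"].strip().lower(), rec["dob"])
        groups[key].add(rec["address_id"])
    return {key: sorted(addrs) for key, addrs in groups.items() if len(addrs) > 1}


# Made-up records: a student listed both at a dormitory and at a family home.
records = [
    {"first": "Ana", "last": "Lopez", "dob": "2001-03-14", "address_id": "DORM-17"},
    {"first": "ana ", "last": "LOPEZ", "dob": "2001-03-14", "address_id": "HOME-42"},
    {"first": "Sam", "last": "Lee", "dob": "1980-07-02", "address_id": "HOME-08"},
]
duplicates = find_cross_address_duplicates(records)
print(duplicates)
```

Exact-key matching of this kind would miss misspelled names and would falsely link distinct people who share a name and birth date, which is one reason an evidence base from real matching studies is needed.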
