5 Analytic Use of Coverage Measurement Data
Pages 119-136



From page 119...
... Finally, the Census Bureau needs to perform analyses that take advantage of the wealth of information from CCM. This chapter first presents a framework for defining, estimating, and modeling the components of census coverage error.
From page 120...
... The Census Bureau should also develop plans for operationalizing the measurement of these components using data from the census and the census coverage measurement program. In addition to producing tabulations of net coverage error at some level roughly comparable to the poststrata used in 2000, the Census Bureau has started developing plans for producing tabulations of components of census coverage error, including the percentages of erroneous enumeration, duplicates, missed enumerations, and people counted in the wrong area (e.g., in the wrong state)
From page 121...
... Toward this end, we outline a general approach to the statistical modeling of component census coverage errors that we believe the Census Bureau should consider. Note that the statistical models outlined will be fit using data from the P-sample and from the associated E-sample enumerations, though efforts should be made to augment these models with data from administrative records, the American Community Survey, and, if possible, other sources.
From page 122...
... For example, due to matching errors, some cases judged to be census duplicates in the coverage measurement program through use of the national match for duplicates will not be duplicates, and so a few correct enumerations will be judged to be duplicates. For the same reason, some actual duplicates will be erroneously classified as correct enumerations.
From page 123...
... With these caveats, we discuss the primary predictors that might be valuable to include in statistical models for component coverage errors. Individuals and Housing Units: There are four individual and housing unit variables that need to be considered.
From page 124...
... None of these mistakes forces a census coverage error, since these mistakes can be, and often are, corrected during the field enumeration. However, it seems clear that indicator variables of these errors are very likely to be associated with a greater frequency of coverage error.
From page 125...
... Three indicator variables related to the initial mail data collection might be associated with coverage errors: (1) an indicator that a foreign language questionnaire was requested or some other contact was made with telephone questionnaire assistance, (2)
From page 126...
... Implementation: To facilitate use of these various dependent and predictor variables, it is important for the census coverage measurement database to be structured to enable the representation of data at multiple levels -- including individual, household, local area, and the area covered by the Census' local enumeration office -- since various variables will exist at these different levels.
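The multi-level structure described above can be sketched with linked records. This is only a minimal illustration, not the Census Bureau's design: every class, field name, and value below (for example `staff_turnover_rate` and `mail_return_rate`) is hypothetical and chosen solely to show one predictor living at each of the four levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalOffice:
    office_id: str
    staff_turnover_rate: float  # hypothetical office-level predictor


@dataclass(frozen=True)
class LocalArea:
    area_id: str
    office_id: str              # link up to the local enumeration office
    mail_return_rate: float     # hypothetical area-level predictor


@dataclass(frozen=True)
class Household:
    household_id: str
    area_id: str                # link up to the local area
    proxy_response: bool        # hypothetical household-level predictor


@dataclass(frozen=True)
class Person:
    person_id: str
    household_id: str           # link up to the household
    age: int                    # hypothetical individual-level predictor
    coverage_status: str        # hypothetical label, e.g. "correct" or "omission"


def model_row(person, households, areas, offices):
    """Flatten one person's record by following the links up each level."""
    household = households[person.household_id]
    area = areas[household.area_id]
    office = offices[area.office_id]
    return {
        "person_id": person.person_id,
        "coverage_status": person.coverage_status,
        "age": person.age,
        "proxy_response": household.proxy_response,
        "mail_return_rate": area.mail_return_rate,
        "staff_turnover_rate": office.staff_turnover_rate,
    }


# Toy data; every value here is invented for illustration.
offices = {"O1": LocalOffice("O1", 0.25)}
areas = {"A1": LocalArea("A1", "O1", 0.7)}
households = {"H1": Household("H1", "A1", True)}
person = Person("P1", "H1", 34, "omission")

row = model_row(person, households, areas, offices)
```

Storing each variable once at its own level and joining on demand avoids repeating office-level or area-level values on every person record, while still yielding the flat rows that a regression routine expects.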
From page 127...
... There is a strong need for multiple approaches to the analysis of such data. It remains to be seen which kinds of exploratory data analyses or statistical modeling techniques are most helpful to the Census Bureau in identifying census deficiencies.
From page 128...
... Recommendation 10: In developing the logistic regression models or other types of discriminant-analysis models of match status, correct enumeration status, and components of census coverage error, the Census Bureau should consider:
• Use of several approaches before focusing on a specific model; besides logistic regression, alternatives should include use of other link functions, discriminant analysis, and various data mining approaches, such as classification trees, support vector machines, and neural nets.
• Thorough examination of the subset of predictors that is best suited to each individual statistical model; the predictors for these various statistical models need not be identical; however, there may be a benefit to constraining the (logistic regression)
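To make "other link functions" concrete, the sketch below applies three standard inverse links (logit, probit, and complementary log-log) to the same linear predictor and scores each against binary outcomes with a Bernoulli log-likelihood. It is an illustration under stated assumptions rather than a model from the report: the function names, coefficients, and data are hypothetical, and no fitting procedure is shown.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link (logistic regression)."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse of the probit link: the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse of the complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


INVERSE_LINKS = {"logit": inv_logit, "probit": inv_probit, "cloglog": inv_cloglog}


def predicted_probability(intercept, coefs, predictors, link="logit"):
    """Map a linear predictor to a probability under the chosen link."""
    eta = intercept + sum(b * x for b, x in zip(coefs, predictors))
    return INVERSE_LINKS[link](eta)


def log_likelihood(intercept, coefs, rows, outcomes, link="logit"):
    """Bernoulli log-likelihood of 0/1 outcomes, e.g. match vs. nonmatch."""
    total = 0.0
    for predictors, y in zip(rows, outcomes):
        p = predicted_probability(intercept, coefs, predictors, link)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Hypothetical toy data: one predictor per case, outcome 1 = matched.
rows = [(0.0,), (1.0,), (2.0,), (3.0,)]
outcomes = [0, 0, 1, 1]
scores = {
    link: log_likelihood(-1.5, (1.0,), rows, outcomes, link)
    for link in INVERSE_LINKS
}
```

Comparing `scores` across links, with coefficients fitted separately for each link and evaluated on held-out cases, is one simple way to act on the advice to try several approaches before settling on a single model.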
From page 129...
... These models have the potential to make use of the entire census database, rather than just the data from the coverage measurement program. In addition, given that proxy enumerations can be viewed as informed imputations, one might also develop statistical models that predict whether an enumeration is likely to be by proxy, again to assess which factors may be related and potentially causal.
From page 130...
... As described above, the coverage measurement program in 2010 will provide information not only on net coverage error, but also on the four components of census coverage error. With regard to net coverage error, somewhat analogous to 2000, the Census Bureau has proposed that it will ... There will be inconsistencies between the tabulations for net coverage error and for components of census coverage error that need to be communicated to census data users. In particular, net census coverage error uses DSE to estimate those missed by both the census and the CCM; because those omissions have no information locating them in a housing unit, they are excluded from tabulations of components of coverage error and associated analyses.
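The source of this inconsistency can be illustrated with the simplest textbook form of dual-system estimation, which assumes that capture in the census and capture in the coverage survey are independent. The sketch below is that simplified form only, not the Bureau's production estimator, and the function names and counts are hypothetical.

```python
def dual_system_estimate(census_correct, psample_total, matched):
    """Textbook dual-system (Lincoln-Petersen) estimate of the true population.

    census_correct: correct enumerations in the census
    psample_total:  people found by the coverage survey (P-sample)
    matched:        people found by both
    """
    if matched <= 0:
        raise ValueError("matched must be positive")
    return census_correct * psample_total / matched


def missed_by_both(census_correct, psample_total, matched):
    """Estimated number of people missed by both the census and the survey."""
    if matched <= 0:
        raise ValueError("matched must be positive")
    return (census_correct - matched) * (psample_total - matched) / matched


# Hypothetical counts for one small area.
census_correct, psample_total, matched = 900, 800, 720

total = dual_system_estimate(census_correct, psample_total, matched)
unseen = missed_by_both(census_correct, psample_total, matched)

# People observed by at least one system can be located in a housing unit;
# the remainder exists only as a model-based estimate.
observed = census_correct + psample_total - matched
```

With these counts `total` is 1000 and `unseen` is 20, so `observed + unseen` equals `total`. The 20 people in the missed-by-both cell contribute to net coverage error but have no record that could be tabulated by component, which is the gap data users need to be told about.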
From page 131...
... Although census data users may not be asking about rates of components of census coverage error, the release of these tabulations will give data users a better sense of the totality of error in the census counts. However, the panel believes that the Census Bureau needs to go further than the release of these tabulations, at least internally, in order to support the ultimate goal of a feedback loop that uses information from coverage measurement for identification of deficient census processes (and possibly even hints at preferred alternative processes)
From page 132...
... Second, it would have the results of the census coverage measurement program assessment of whether the individual was properly counted, omitted, erroneously included, duplicated, or included at the wrong location.
From page 133...
... A caution, though, is that it is the nature of such exploratory analyses to "discover" situations that are illusory or patterns that are idiosyncratic aspects of the 2010 census. Recommendation 11: The primary output of the Census Bureau's coverage measurement program in 2010 should be an analytic database that is used to support the development of statistical models to inform census process improvement. The production of summary tabulations should be of lesser priority.
From page 134...
... The size of the CCM, and the general infrequency of the four components of census coverage error (in 2000, there were 37 million estimated omissions and overcounts of 283 million enumerations, and, as discussed often in this report, that is an overestimate of the gross error rate; see National Research Council, 2004b) crossed by census processes used, among other variables, will necessarily limit the value of the proposed database to address very detailed questions.
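A rough calculation shows why detailed cross-classification thins the data so quickly. The two 2000 figures below are the ones quoted above; the sample size and cell count are hypothetical placeholders, and the even-spread assumption is a deliberate simplification.

```python
# Figures quoted in the text for 2000, in millions. The text describes the
# resulting ratio as an overestimate of the gross error rate.
ERRORS_MILLIONS = 37
ENUMERATIONS_MILLIONS = 283

gross_rate_overestimate = ERRORS_MILLIONS / ENUMERATIONS_MILLIONS  # about 0.13


def expected_error_cases_per_cell(sample_size, error_rate, n_cells):
    """Average number of error cases per cell if cases spread evenly."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return sample_size * error_rate / n_cells


# Hypothetical: a sample of 100,000 people crossed into 1,000 cells
# (for example, census process by geography by demographic group).
per_cell = expected_error_cases_per_cell(100_000, gross_rate_overestimate, 1_000)
```

Because the rate covers all error components combined and is itself an overestimate, the count available for any single component in any single cell would be smaller still.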
From page 135...
... Second, while most census data users will be uninterested in either developing their own models or even the models that the Census Bureau develops, there will be interest in why the Census Bureau is making various changes in the census coverage measurement program; therefore, the Census Bureau may find it useful to issue a research report series on major findings from the analyses (both internally and externally) of a CCM database.

