Panel on Correlation Bias and Coverage Measurement in the
2010 Decennial Census
Robert M. Bell and Michael L. Cohen, Editors
Committee on National Statistics
Division of Behavioral and Social Sciences and Education
THE NATIONAL ACADEMIES PRESS 500 Fifth Street, N.W. Washington, DC 20001
NOTICE: The project that is the subject of this report was approved by the Governing
Board of the National Research Council, whose members are drawn from the
councils of the National Academy of Sciences, the National Academy of Engineering,
and the Institute of Medicine. The members of the committee responsible
for the report were chosen for their special competences and with regard for
appropriate balance.
The project that is the subject of this report was supported by contract no. YA1323
04CN0006 between the National Academy of Sciences and the U.S. Census
Bureau. Support of the work of the Committee on National Statistics is provided
by a consortium of federal agencies through a grant from the U.S. National Science
Foundation (Number SBR0112521). Any opinion, findings, conclusions, or
recommendations expressed in this publication are those of the author(s) and do
not necessarily reflect the views of the organizations or agencies that provided
support for the project.
International Standard Book Number-13: 978-0-309-12826-1
International Standard Book Number-10: 0-309-12826-9
Additional copies of this report are available from The National Academies Press,
500 Fifth Street, NW, Lockbox 285, Washington, DC 20055 or (800) 624-6242 or
(202) 334-3313 (in the Washington metropolitan area); Internet, http://www.nap.edu.
Copyright 2009 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Suggested citation: National Research Council (2009). Coverage Measurement in
the 2010 Census. Panel on Correlation Bias and Coverage Measurement in the
2010 Decennial Census, Robert M. Bell and Michael L. Cohen (Eds.). Committee
on National Statistics, Division of Behavioral and Social Sciences and Education.
Washington, DC: The National Academies Press.
The National Academy of Sciences is a private, nonprofit, self-perpetuating
society of distinguished scholars engaged in scientific and engineering research,
dedicated to the furtherance of science and technology and to their use for the
general welfare. Upon the authority of the charter granted to it by the Congress
in 1863, the Academy has a mandate that requires it to advise the federal government
on scientific and technical matters. Dr. Ralph J. Cicerone is president of the
National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter
of the National Academy of Sciences, as a parallel organization of outstanding
engineers. It is autonomous in its administration and in the selection of its members,
sharing with the National Academy of Sciences the responsibility for advising
the federal government. The National Academy of Engineering also sponsors
engineering programs aimed at meeting national needs, encourages education
and research, and recognizes the superior achievements of engineers. Dr. Charles
M. Vest is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of
Sciences to secure the services of eminent members of appropriate professions
in the examination of policy matters pertaining to the health of the public. The
Institute acts under the responsibility given to the National Academy of Sciences
by its congressional charter to be an adviser to the federal government and, upon
its own initiative, to identify issues of medical care, research, and education.
Dr. Harvey V. Fineberg is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of
Sciences in 1916 to associate the broad community of science and technology
with the Academy’s purposes of furthering knowledge and advising the federal
government. Functioning in accordance with general policies determined by the
Academy, the Council has become the principal operating agency of both the
National Academy of Sciences and the National Academy of Engineering in providing
services to the government, the public, and the scientific and engineering
communities. The Council is administered jointly by both Academies and the
Institute of Medicine. Dr. Ralph J. Cicerone and Dr. Charles M. Vest are chair and
vice chair, respectively, of the National Research Council.
www.national-academies.org
PANEL ON CORRELATION BIAS AND COVERAGE
MEASUREMENT IN THE 2010 DECENNIAL CENSUS
ROBERT BELL (Chair), AT&T Research Laboratories, Florham Park,
New Jersey
LAWRENCE BROWN, Department of Statistics, Wharton School,
University of Pennsylvania
RODERICK LITTLE, Departments of Biostatistics and Statistics,
University of Michigan
XIAOLI MENG, Department of Statistics, Harvard University
JEFFREY PASSEL, Pew Hispanic Center, Washington, DC
DONALD YLVISAKER, Department of Statistics, University of
California, Los Angeles (emeritus)
ALAN M. ZASLAVSKY, Department of Health Care Policy, Harvard
Medical School
MICHAEL L. COHEN, Study Director
DANIEL L. CORK, Senior Program Officer
AGNES E. GASKIN, Senior Program Assistant
BARBARA A. BAILAR, Consultant
MEYER ZITTER, Consultant
COMMITTEE ON NATIONAL STATISTICS
2007–2008
WILLIAM F. EDDY (Chair), Department of Statistics, Carnegie Mellon
University
KATHARINE ABRAHAM, Department of Economics and Joint Program
in Survey Methodology, University of Maryland
WILLIAM DuMOUCHEL, Lincoln Technologies, Inc., Waltham,
Massachusetts
JOHN HALTIWANGER, Department of Economics, University of
Maryland
V. JOSEPH HOTZ, Department of Economics, Duke University
KAREN KAFADAR, Department of Statistics, Indiana University
DOUGLAS MASSEY, Department of Sociology, Princeton University
SALLY MORTON, RTI International, Research Triangle Park, North
Carolina
VIJAY NAIR, Department of Statistics and Department of Industrial and
Operations Engineering, University of Michigan
JOSEPH NEWHOUSE, Division of Health Policy Research and
Education, Harvard University
SAMUEL H. PRESTON, Population Studies Center, University of
Pennsylvania
KENNETH PREWITT, School of International and Public Affairs,
Columbia University
LOUISE RYAN, Department of Biostatistics, Harvard University
ROGER TOURANGEAU, Joint Program in Survey Methodology,
University of Maryland, and Survey Research Center, University of
Michigan
ALAN ZASLAVSKY, Department of Health Care Policy, Harvard
University Medical School
CONSTANCE F. CITRO, Director
Acknowledgments
The Panel on Coverage Evaluation and Correlation Bias in the 2010
Census wishes to thank the many people who contributed to our work.
The initial idea for the study came from Hermann Habermann, then
deputy director of the Census Bureau. Many other Census Bureau personnel
were also instrumental in providing assistance. The contracting officer
for this study was Philip Gbur, whose efforts should serve as a model
of how best to provide for smooth communications between a National
Research Council (NRC) panel and its sponsor. Donna Kostanich was
extremely generous with her time and that of her staff, all of whom gave
excellent summary presentations on the status of their various research
efforts. Along with Philip Gbur, Donna Kostanich established a friendly,
collegial environment between her staff and the panel.
We thank the staff of the Census Bureau’s census coverage and coverage
measurement group for their presentations: Tamara Adams, Paul
Livermore Auer, William Bell, Pete Davis, Gregg Diffendal, James Farber,
Rick Griffin, Tom Mule, Mary Mulry, Sally Obenski, Doug Olson, Robin
Pennington, Preston J. Waite, and David Whitford. The Census Bureau
also provided onsite access to the A.C.E. Research Database.
Huilin Li of the University of Maryland carried out many difficult
computations on this database at the direction of the panel and staff,
and we thank her for her patience and expertise. Stephanie Jaros of the
University of Washington provided a comprehensive bibliography on
ethnography and census undercoverage, and also provided an excellent
paper summarizing ethnographic information on intentional reasons for
undercoverage in the decennial census.
As consultant to the panel, Barbara Bailar provided important insights
on the history of coverage measurement and its implications for 2010.
Also, Roger Tourangeau, member of a sister NRC panel on residence rules
in the decennial census, assisted the panel in learning about probes for
the possibility of alternative residences on both the coverage follow-up
interview and the census coverage measurement interview.
The panel is indebted to Eugenia Grohman of the staff of NRC’s Division
of Behavioral and Social Sciences and Education for her expert technical
editing of the draft report. Also, NRC staff Christine Chen, Lance
Hunter, and Agnes Gaskin, as always, provided excellent administrative
support for the panel.
We are especially grateful to the project’s study director, Michael
Cohen, who coordinated both the information gathering and report writing
processes for the panel. He did a superb job of organizing the panel’s
often disjointed observations to facilitate creation of this final report. We
would also like to thank Dan Cork for helping to organize and oversee
the conduct of the meetings of the panel, and for greatly improving
the appearance of the panel’s reports, and we are extremely grateful to
Connie Citro for helping to oversee all aspects of the study from its inception
to publication of the final report, asking very perceptive questions
during the panel’s meetings, rewriting part of the executive summary, and
providing enormously useful advice whenever difficult situations arose.
This report has been reviewed in draft form by individuals chosen for
their diverse perspectives and technical expertise, in accordance with procedures
approved by the NRC’s Report Review Committee. The purpose
of this independent review is to provide candid and critical comments
that will assist the institution in making the published report as sound
as possible and to ensure that the report meets institutional standards
for objectivity, evidence, and responsiveness to the study charge. The
review comments and draft manuscript remain confidential to protect the
integrity of the deliberative process.
We thank the following individuals for their participation in the review
of this report: Margo Anderson, Department of History, University of
Wisconsin; Eugene P. Ericksen, Department of Sociology, Temple University;
David McMillen, External Affairs Liaison’s Office, National Archives
and Records Administration, Washington, DC; Colm A. O’Muircheartaigh,
Harris Graduate School of Public Policy Studies, The University of Chicago;
Keith Rust, Westat, Inc., Rockville, MD; Herbert L. Smith, Population
Studies Center, University of Pennsylvania; and Martin T. Wells, Department
of Social Statistics, Cornell University.
Although the reviewers listed above provided many constructive comments
and suggestions, they were not asked to endorse the conclusions or
recommendations, nor did they see the final draft of the report before its
release. The review of the report was overseen by Henry Riecken, professor
of behavioral sciences, emeritus, University of Pennsylvania, and John
Rolph, Marshall School of Business, University of Southern California.
Appointed by the NRC, they were responsible for making certain that
an independent examination of the report was carried out in accordance
with institutional procedures and that all review comments were carefully
considered. Responsibility for the final content of this report, however,
rests entirely with the authoring panel and the institution.
Robert M. Bell, Chair
Panel on Correlation Bias and
Coverage Measurement in the 2010
Decennial Census
Contents
GLOSSARY OF TECHNICAL TERMS
EXECUTIVE SUMMARY
1 INTRODUCTION
Program Objectives
Panel Charge and Work Plan
Plan of the Report
2 FUNDAMENTALS OF COVERAGE MEASUREMENT
Types of Census Errors
Coverage Error Metrics for Aggregates
Purposes
Description and History
Demographic Analysis
1950–1990 Censuses
2000 Census
3 PLANS FOR THE 2010 CENSUS
Major Design Changes
Treatment of Duplicates
Contamination
Administrative Records
4 TECHNICAL ISSUES
Sample Design for Census Coverage Measurement
Logistic Regression Models
Missing Data in Net Coverage Error Models
Matching Cases with Minimal Information
Demographic Analysis
5 ANALYTIC USE OF COVERAGE MEASUREMENT DATA
Framework for Understanding Coverage Errors
Statistical Modeling
Census Data Products for Process Improvement
REFERENCES AND BIBLIOGRAPHY
APPENDIXES
A A Framework for Components of Census Coverage Error
B Logistic Regression for Modeling Match and Correct Enumeration Rates
C Biographical Sketches of Panel Members and Staff
Glossary of Technical Terms
Accuracy and coverage evaluation (A.C.E.). The coverage measurement
program based on dual-systems estimation that was used to evaluate the
coverage of the 2000 census.
Adjustment. The use of information from coverage measurement programs
to modify census counts in an attempt to correct for coverage errors
in the census.
Administrative records. Data in administrative files that are used to
help administer governmental programs (e.g., to assess eligibility and for
funds distribution).
American Community Survey (ACS). An unclustered continuous household
survey that collects information similar to that collected on the old
decennial census long form. (Estimates were first available in 2006 from
information collected in 2005.)
Be Counted. A decennial census coverage improvement program that
provides questionnaires in public locations for individuals to fill out and
return if they believe they were missed in the census.
Block clusters. Collections of roughly 30 contiguous housing units that
the Census Bureau creates for all U.S. households. In urban areas, these
are often individual city blocks.
Census coverage measurement (CCM). The coverage measurement program
that will be used to evaluate the coverage of the 2010 census; also,
the postenumeration survey and other parts of the 2010 coverage measurement
program.
Classification and regression trees. A method for fitting either a categorical
or continuous response by developing a decision tree that determines
subsets of cases whose most frequent responses (for categorical responses)
or whose average values (for continuous responses) are used to provide
fitted values.
Components of census coverage error. The four possible census errors:
omissions, erroneous enumerations, duplications, and enumerations in
the wrong location.
Contamination. A situation in which the census processes carried out
in the postenumeration block clusters are different from those in the
remainder of the country in ways that may affect census counts for those
blocks.
Correlation bias. The bias in dual-systems estimation that is due to the
correlation of the individual enumeration propensities for the census and
those for the postenumeration survey.
Coverage evaluation. The process of developing a quantitative or qualitative
assessment of the quality of the counts in a census.
Coverage follow-up interview. A telephone interview in the 2010 census
that will follow up those households for which there is information that
a coverage error may have occurred. It will also be used for households
with more than six members. (This interview in a sense replaces the
coverage edit follow-up interview in the 2000 census, which followed
up large households and households that had discrepancies between the
indicated household size and the number of people for whom individual
characteristics were provided.)
Coverage measurement. The process of developing a quantitative assessment
of the quality of the counts in a census; hence, a part of coverage
evaluation.
Data-defined enumeration. A census enumeration for which two nonimputed
characteristics have been collected.
Demographic analysis. An approach to coverage measurement that
bridges the counts for a demographic group from one census to the next
through the addition of births and immigrants and the subtraction of
deaths and emigrants.
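The bookkeeping behind this definition can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name, the argument names, and all figures are hypothetical, and actual demographic analysis relies on estimated rather than exact components.

```python
def demographic_analysis_estimate(previous_count, births, immigrants,
                                  deaths, emigrants):
    """Carry a group's count forward from one census to the next."""
    return previous_count + births + immigrants - deaths - emigrants


# Hypothetical figures for one demographic group (not real data).
expected_count = demographic_analysis_estimate(
    previous_count=1_000_000,
    births=150_000,
    immigrants=40_000,
    deaths=80_000,
    emigrants=10_000,
)

# Comparing the expected count with a (hypothetical) census count gives an
# implied net undercount for the group.
census_count = 1_080_000
implied_net_undercount = expected_count - census_count
```

With these made-up inputs the expected count is 1,100,000, so the implied net undercount is 20,000 people.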
Differential undercount. The difference between the net undercount for
a particular demographic or geographic domain and the net undercount
either for another domain or for the nation. (See also Net undercount and
Undercount.)
Discriminant analysis. A statistical model that uses a set of variables to
construct a function that fits a dichotomous response typically by providing
probabilities that a case was a member of one of the two groups.
Domain. A collection of individuals defined by various characteristics,
usually geographic and demographic.
Dual-systems estimation. An approach to coverage measurement that
uses the census and a postenumeration survey as two independent enumerations
of a population. The two enumerations are matched to each
other to determine how many are identical, with the results input into a
statistical model (referred to in other contexts as capture-recapture).
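As a rough illustration, the simplest capture-recapture form of this idea (often called the Lincoln-Petersen estimator) can be written as follows. This is a minimal sketch, not the Census Bureau's production estimator, which also has to deal with erroneous enumerations, missing data, and sampling weights. The function names and the counts are hypothetical.

```python
def dual_systems_estimate(census_count, pes_count, matched_count):
    """Estimate the population total from two independent enumerations.

    If the two enumerations are independent, the share of survey cases
    found in the census estimates census coverage, so
    total = census_count * pes_count / matched_count.
    """
    if matched_count <= 0:
        raise ValueError("at least one matched case is required")
    return census_count * pes_count / matched_count


def missed_by_both(census_count, pes_count, matched_count):
    """Estimated number of people missed by both enumerations."""
    total = dual_systems_estimate(census_count, pes_count, matched_count)
    counted_at_least_once = census_count + pes_count - matched_count
    return total - counted_at_least_once


# Hypothetical counts for a single domain (not real data).
total = dual_systems_estimate(census_count=900, pes_count=800, matched_count=720)
missed = missed_by_both(census_count=900, pes_count=800, matched_count=720)
```

With these made-up counts, `total` is 1000.0 and `missed` is 20.0. The second quantity corresponds to the group that this glossary calls the fourth cell.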
Erroneous enumerations. Individuals enumerated in the decennial census
for whom no enumeration should have been made (e.g., short-term
visitors).
E-sample. Generally, the sample of enumerations in the decennial census
corresponding to households located in the P-sample block clusters.
Fourth cell. The group of individuals in dual-systems estimation that are
missed in both the census and in the postenumeration survey.
Geocoding error. Misidentification of the census block in which an
address is physically located.
Gross census error. The total number of both undercounting and overcounting
errors for a domain.
Imputation. A technique that “fills in” values for nonresponses, usually
based on information collected for other, complete data cases.
KE enumerations. In the 2000 census, enumerations that were data-defined
but considered to be insufficient for purposes of matching in
dual-systems estimation.
Logistic regression. A statistical model that uses a logistic function of a
linear combination of covariates to estimate the probability that a case was
a member of one of two groups.
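A small sketch may make the phrase "logistic function of a linear combination of covariates" concrete. The function name, coefficients, and covariate values below are hypothetical and are not taken from any model in this report.

```python
import math


def logistic_probability(intercept, coefficients, covariates):
    """Probability of group membership under a logistic regression model."""
    if len(coefficients) != len(covariates):
        raise ValueError("need exactly one coefficient per covariate")
    # Linear combination of the covariates.
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    # Logistic function, written in a form that avoids overflow for
    # large negative or large positive values of the linear term.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    exp_linear = math.exp(linear)
    return exp_linear / (1.0 + exp_linear)


# Hypothetical coefficients and covariate values for one case.
p_match = logistic_probability(1.5, [0.8, -0.4], [1.0, 2.0])
```

Here the linear combination is 1.5, so `p_match` is roughly 0.82. A linear combination of zero gives a probability of exactly 0.5.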
Master Address File (MAF). The Census Bureau’s current collection of
addresses for all U.S. residents and businesses. It is used to develop
the Decennial Master Address File, which supports the decennial census
mailout operation. (See also Topologically Integrated Geographic Encoding
and Referencing System.)
Net undercount. The difference between the census undercount and
census overcount, often expressed as a rate. (See also Differential undercount
and Undercount.)
Nonresponse follow-up. A decennial census operation used to interview
households that failed to fill out a census questionnaire.
Omissions. Individuals whom the decennial census failed to enumerate
who should have been enumerated.
Postenumeration survey (PES). A national survey that is operationally
independent of the decennial census, taken shortly after the census has
concluded its data collection, for purposes of coverage evaluation based
on dual-systems estimation.
Poststratification. The use of covariates to define what are believed to be
more homogeneous subgroups of the population for which separate dual-systems
estimation computations are conducted.
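The motivation for poststratification can be illustrated with a small, purely hypothetical calculation. It assumes two subgroups of 1,000 people each, with enumeration independent within each subgroup but with different enumeration probabilities (0.9 in both systems for one subgroup, 0.5 in both for the other). It uses the simplest dual-systems formula and expected counts instead of sample data, and the group labels are invented.

```python
def dual_systems_estimate(census_count, pes_count, matched_count):
    """Simplest dual-systems (capture-recapture) estimate of a total."""
    return census_count * pes_count / matched_count


# Hypothetical expected counts: (in census, in survey, in both).
# Each subgroup truly contains 1,000 people.
strata = {
    "easier to count": (900, 900, 810),  # 0.9 * 0.9 * 1000 matches
    "harder to count": (500, 500, 250),  # 0.5 * 0.5 * 1000 matches
}

# Pooled estimate: ignores the difference between the subgroups.
pooled = dual_systems_estimate(
    sum(census for census, _, _ in strata.values()),
    sum(pes for _, pes, _ in strata.values()),
    sum(matched for _, _, matched in strata.values()),
)

# Poststratified estimate: one computation per subgroup, then summed.
poststratified = sum(dual_systems_estimate(*cells) for cells in strata.values())
```

In this constructed example the poststratified estimate recovers the true total of 2,000, while the pooled estimate falls short of it (about 1,849). The shortfall arises because people who are hard to count in the census are also hard to count in the survey, which is the mechanism this glossary describes as correlation bias.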
Proxy enumeration. Information collected from landlords or neighbors
in the place of information that was intended to be collected from a
household’s residents.
P-sample. Generally, the residents of the set of households located in the
sample of block clusters selected to be included in the postenumeration
survey to support dual-systems estimation.
Statistical Administrative Records System (StARS). A national roster
constructed for research purposes by the Census Bureau that links people
to current addresses by merging and unduplicating a number of administrative
records.
Synthetic estimation. A statistical method for small-area estimation that
assumes that net undercount rates for demographic groups in a domain
apply without change to all geographic subsets of that domain.
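The assumption in this definition can be sketched as a short calculation. The example assumes that a net undercount rate is expressed as a fraction of the true population, so that an adjusted count is the census count divided by one minus the rate. The function name, group labels, and figures are all hypothetical.

```python
def synthetic_estimate(census_counts_by_group, net_undercount_rate_by_group):
    """Estimate a small area's population from domain-level rates.

    Each group's census count in the small area is adjusted with the
    net undercount rate estimated for that group in the larger domain,
    on the assumption that the rate applies unchanged in the small area.
    """
    total = 0.0
    for group, census_count in census_counts_by_group.items():
        rate = net_undercount_rate_by_group[group]
        total += census_count / (1.0 - rate)
    return total


# Hypothetical domain-level rates and census counts for one small area.
domain_rates = {"group A": 0.05, "group B": 0.02}
small_area_counts = {"group A": 950, "group B": 490}
estimate = synthetic_estimate(small_area_counts, domain_rates)
```

With these made-up inputs the estimate is approximately 1,500 (about 1,000 for group A plus about 500 for group B), compared with a census count of 1,440.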
Topologically Integrated Geographic Encoding and Referencing System
(TIGER). A geographic information system that links a given address to
a physical location defined by city blocks, roads and railroads, natural
boundaries, and political boundaries.
Undercount. The measurement of either the number or rate of individuals
missed in the census that should have been enumerated. (See also Differential
undercount and Net undercount.)