COVERAGE MEASUREMENT IN THE 2010 CENSUS

Panel on Correlation Bias and Coverage Measurement in the 2010 Decennial Census

Robert M. Bell and Michael L. Cohen, Editors

Committee on National Statistics

Division of Behavioral and Social Sciences and Education

NATIONAL RESEARCH COUNCIL OF THE NATIONAL ACADEMIES

THE NATIONAL ACADEMIES PRESS

Washington, D.C.
www.nap.edu




THE NATIONAL ACADEMIES PRESS
500 Fifth Street, N.W.
Washington, DC 20001

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

The project that is the subject of this report was supported by contract no. YA1323-04-CN-0006 between the National Academy of Sciences and the U.S. Census Bureau. Support of the work of the Committee on National Statistics is provided by a consortium of federal agencies through a grant from the U.S. National Science Foundation (Number SBR-0112521). Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.

International Standard Book Number-13: 978-0-309-12826-1
International Standard Book Number-10: 0-309-12826-9

Additional copies of this report are available from The National Academies Press, 500 Fifth Street, NW, Lockbox 285, Washington, DC 20055; (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area); Internet, http://www.nap.edu.

Copyright 2009 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

Suggested citation: National Research Council (2009). Coverage Measurement in the 2010 Census. Panel on Correlation Bias and Coverage Measurement in the 2010 Decennial Census, Robert M. Bell and Michael L. Cohen (Eds.). Committee on National Statistics, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. Charles M. Vest is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Harvey V. Fineberg is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. Charles M. Vest are chair and vice chair, respectively, of the National Research Council.

www.national-academies.org

PANEL ON CORRELATION BIAS AND COVERAGE MEASUREMENT IN THE 2010 DECENNIAL CENSUS

ROBERT BELL (Chair), AT&T Research Laboratories, Florham Park, New Jersey
LAWRENCE BROWN, Department of Statistics, Wharton School, University of Pennsylvania
RODERICK LITTLE, Departments of Biostatistics and Statistics, University of Michigan
XIAO-LI MENG, Department of Statistics, Harvard University
JEFFREY PASSEL, Pew Hispanic Center, Washington, DC
DONALD YLVISAKER, Department of Statistics, University of California, Los Angeles (emeritus)
ALAN M. ZASLAVSKY, Department of Health Care Policy, Harvard Medical School

MICHAEL L. COHEN, Study Director
DANIEL L. CORK, Senior Program Officer
AGNES E. GASKIN, Senior Program Assistant
BARBARA A. BAILAR, Consultant
MEYER ZITTER, Consultant

COMMITTEE ON NATIONAL STATISTICS
2007–2008

WILLIAM F. EDDY (Chair), Department of Statistics, Carnegie Mellon University
KATHARINE ABRAHAM, Department of Economics and Joint Program in Survey Methodology, University of Maryland
WILLIAM DUMOUCHEL, Lincoln Technologies, Inc., Waltham, Massachusetts
JOHN HALTIWANGER, Department of Economics, University of Maryland
V. JOSEPH HOTZ, Department of Economics, Duke University
KAREN KAFADAR, Department of Statistics, Indiana University
DOUGLAS MASSEY, Department of Sociology, Princeton University
SALLY MORTON, RTI International, Research Triangle Park, North Carolina
VIJAY NAIR, Department of Statistics and Department of Industrial and Operations Engineering, University of Michigan
JOSEPH NEWHOUSE, Division of Health Policy Research and Education, Harvard University
SAMUEL H. PRESTON, Population Studies Center, University of Pennsylvania
KENNETH PREWITT, School of International and Public Affairs, Columbia University
LOUISE RYAN, Department of Biostatistics, Harvard University
ROGER TOURANGEAU, Joint Program in Survey Methodology, University of Maryland, and Survey Research Center, University of Michigan
ALAN ZASLAVSKY, Department of Health Care Policy, Harvard University Medical School

CONSTANCE F. CITRO, Director

Acknowledgments

The Panel on Correlation Bias and Coverage Measurement in the 2010 Decennial Census wishes to thank the many people who contributed to our work. The initial idea for the study came from Hermann Habermann, then deputy director of the Census Bureau. Many other Census Bureau personnel were also instrumental in providing assistance. The contracting officer for this study was Philip Gbur, whose efforts should serve as a model of how best to provide for smooth communications between a National Research Council (NRC) panel and its sponsor. Donna Kostanich was extremely generous with her time and that of her staff, all of whom gave excellent summary presentations on the status of their various research efforts. Along with Philip Gbur, Donna Kostanich established a friendly, collegial environment between her staff and the panel.

We thank the staff of the Census Bureau's census coverage and coverage measurement group for their presentations: Tamara Adams, Paul Livermore Auer, William Bell, Pete Davis, Gregg Diffendal, James Farber, Rick Griffin, Tom Mule, Mary Mulry, Sally Obenski, Doug Olson, Robin Pennington, Preston J. Waite, and David Whitford. The Census Bureau also provided on-site access to the A.C.E. Research Database. Huilin Li of the University of Maryland carried out many difficult computations on this database at the direction of the panel and staff, and we thank her for her patience and expertise.

Stephanie Jaros of the University of Washington provided a comprehensive bibliography on ethnography and census undercoverage, and also provided an excellent paper summarizing ethnographic information on intentional reasons for undercoverage in the decennial census. As consultant to the panel, Barbara Bailar provided important insights on the history of coverage measurement and its implications for 2010. Also, Roger Tourangeau, member of a sister NRC panel on residence rules in the decennial census, assisted the panel in learning about probes for the possibility of alternative residences on both the coverage follow-up interview and the census coverage measurement interview.

The panel is indebted to Eugenia Grohman of the staff of NRC's Division of Behavioral and Social Sciences and Education for her expert technical editing of the draft report. Also, NRC staff Christine Chen, Lance Hunter, and Agnes Gaskin, as always, provided excellent administrative support for the panel.

We are especially grateful to the project's study director, Michael Cohen, who coordinated both the information-gathering and report-writing processes for the panel. He did a superb job of organizing the panel's often disjointed observations to facilitate creation of this final report. We would also like to thank Dan Cork for helping to organize and oversee the conduct of the meetings of the panel and for greatly improving the appearance of the panel's reports, and we are extremely grateful to Connie Citro for helping to oversee all aspects of the study from its inception to publication of the final report, asking very perceptive questions during the panel's meetings, rewriting part of the executive summary, and providing enormously useful advice whenever difficult situations arose.

This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the NRC's Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making the published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We thank the following individuals for their participation in the review of this report: Margo Anderson, Department of History, University of Wisconsin; Eugene P. Ericksen, Department of Sociology, Temple University; David McMillen, External Affairs Liaison's Office, National Archives and Records Administration, Washington, DC; Colm A. O'Muircheartaigh, Harris Graduate School of Public Policy Studies, The University of Chicago; Keith Rust, Westat, Inc., Rockville, MD; Herbert L. Smith, Population Studies Center, University of Pennsylvania; and Martin T. Wells, Department of Social Statistics, Cornell University.

Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations, nor did they see the final draft of the report before its release. The review of the report was overseen by Henry Riecken, professor of behavioral sciences, emeritus, University of Pennsylvania, and John Rolph, Marshall School of Business, University of Southern California. Appointed by the NRC, they were responsible for making certain that an independent examination of the report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report, however, rests entirely with the authoring panel and the institution.

Robert M. Bell, Chair
Panel on Correlation Bias and Coverage Measurement in the 2010 Decennial Census

Contents

GLOSSARY OF TECHNICAL TERMS xiii

EXECUTIVE SUMMARY 1

1 INTRODUCTION 7
  Program Objectives, 7
  Panel Charge and Work Plan, 9
  Plan of the Report, 12

2 FUNDAMENTALS OF COVERAGE MEASUREMENT 15
  Types of Census Errors, 16
  Coverage Error Metrics for Aggregates, 19
  Purposes, 22
  Description and History, 27
  Demographic Analysis, 36
  1950–1990 Censuses, 40
  2000 Census, 45

3 PLANS FOR THE 2010 CENSUS 55
  Major Design Changes, 55
  Treatment of Duplicates, 59
  Contamination, 67
  Administrative Records, 73

4 TECHNICAL ISSUES 81
  Sample Design for Census Coverage Measurement, 81
  Logistic Regression Models, 89
  Missing Data in Net Coverage Error Models, 102
  Matching Cases with Minimal Information, 109
  Demographic Analysis, 111

5 ANALYTIC USE OF COVERAGE MEASUREMENT DATA 119
  Framework for Understanding Coverage Errors, 119
  Statistical Modeling, 121
  Census Data Products for Process Improvement, 130

REFERENCES AND BIBLIOGRAPHY 137

APPENDIXES
A A Framework for Components of Census Coverage Error 145
B Logistic Regression for Modeling Match and Correct Enumeration Rates 153
C Biographical Sketches of Panel Members and Staff 157

Glossary of Technical Terms

Accuracy and coverage evaluation (A.C.E.). The coverage measurement program based on dual-systems estimation that was used to evaluate the coverage of the 2000 census.

Adjustment. The use of information from coverage measurement programs to modify census counts in an attempt to correct for coverage errors in the census.

Administrative records. Data in administrative files that are used to help administer governmental programs (e.g., to assess eligibility and for funds distribution).

American Community Survey (ACS). An unclustered continuous household survey that collects information similar to that collected on the old decennial census long form. (Estimates were first available in 2006 from information collected in 2005.)

Be Counted. A decennial census coverage improvement program that provides questionnaires in public locations for individuals to fill out and return if they believe they were missed in the census.

Block clusters. Collections of roughly 30 contiguous housing units that the Census Bureau creates for all U.S. households. In urban areas, these are often individual city blocks.

Census coverage measurement (CCM). The coverage measurement program that will be used to evaluate the coverage of the 2010 census; also, the postenumeration survey and other parts of the 2010 coverage measurement program.

Classification and regression trees. A method for fitting either a categorical or continuous response by developing a decision tree that determines subsets of cases whose most frequent responses (for categorical responses) or whose average values (for continuous responses) are used to provide fitted values.

Components of census coverage error. The four possible census errors: omissions, erroneous enumerations, duplications, and enumerations in the wrong location.

Contamination. A situation in which the census processes carried out in the postenumeration block clusters are different from those in the remainder of the country in ways that may affect census counts for those blocks.

Correlation bias. The bias in dual-systems estimation that is due to the correlation of the individual enumeration propensities for the census and those for the postenumeration survey.

Coverage evaluation. The process of developing a quantitative or qualitative assessment of the quality of the counts in a census.

Coverage follow-up interview. A telephone interview in the 2010 census that will follow up those households for which there is information that a coverage error may have occurred. It will also be used for households with more than six members. (This interview in a sense replaces the coverage edit follow-up interview in the 2000 census, which followed up large households and households that had discrepancies between the indicated household size and the number of people for whom individual characteristics were provided.)

Coverage measurement. The process of developing a quantitative assessment of the quality of the counts in a census; hence, a part of coverage evaluation.

Data-defined enumeration. A census enumeration for which two nonimputed characteristics have been collected.

Demographic analysis. An approach to coverage measurement that bridges the counts for a demographic group from one census to the next through the addition of births and immigrants and the subtraction of deaths and emigrants.

Differential undercount. The difference between the net undercount for a particular demographic or geographic domain and the net undercount either for another domain or for the nation. (See also Net undercount and Undercount.)

Discriminant analysis. A statistical model that uses a set of variables to construct a function that fits a dichotomous response, typically by providing probabilities that a case was a member of one of the two groups.

Domain. A collection of individuals defined by various characteristics, usually geographic and demographic.

Dual-systems estimation. An approach to coverage measurement that uses the census and a postenumeration survey as two independent enumerations of a population. The two enumerations are matched to each other to determine how many are identical, with the results input into a statistical model (referred to in other contexts as capture-recapture).

Erroneous enumerations. Individuals enumerated in the decennial census for whom no enumeration should have been made (e.g., short-term visitors).

E-sample. Generally, the sample of enumerations in the decennial census corresponding to households located in the P-sample block clusters.

Fourth cell. The group of individuals in dual-systems estimation that are missed in both the census and in the postenumeration survey.

Geocoding error. Misidentification of the census block in which an address is physically located.

Gross census error. The total number of both undercounting and overcounting errors for a domain.

Imputation. A technique that "fills in" values for nonresponses, usually based on information collected for other, complete data cases.
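In its simplest form, the dual-systems estimation defined above reduces to the classical Lincoln-Petersen capture-recapture estimator. The sketch below illustrates the arithmetic only; the function name and all counts are hypothetical and do not come from any census.

```python
# Illustrative dual-systems (capture-recapture) calculation.
# All counts are made up for demonstration purposes.

def dual_systems_estimate(census_count, pes_count, matched):
    """Lincoln-Petersen estimator: N_hat = (C * P) / M, where
    C = correct census enumerations in the area,
    P = persons found by the postenumeration survey (P-sample),
    M = persons found in both systems (the match count)."""
    return census_count * pes_count / matched

# Example: 900 census enumerations, 800 P-sample persons,
# 720 matched in both systems.
n_hat = dual_systems_estimate(900, 800, 720)
print(n_hat)  # 1000.0 -- estimated true population size
```

The "fourth cell" entry above corresponds to the people this model infers but never observes: here, the 20 persons missed by both systems (1000 - 900 - 800 + 720). Correlation bias arises when the independence assumption behind this formula fails.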

KE enumerations. In the 2000 census, enumerations that were data defined but considered to be insufficient for purposes of matching in dual-systems estimation.

Logistic regression. A statistical model that uses a logistic function of a linear combination of covariates to estimate the probability that a case was a member of one of two groups.

Master Address File (MAF). The Census Bureau's current collection of addresses for all U.S. residents and businesses. It is used to develop the Decennial Master Address File, which supports the decennial census mailout operation. (See also Topologically Integrated Geographic Encoding and Referencing System.)

Net undercount. The difference between the census undercount and census overcount, often expressed as a rate. (See also Differential undercount and Undercount.)

Nonresponse follow-up. A decennial census operation used to interview households that failed to fill out a census questionnaire.

Omissions. Individuals whom the decennial census failed to enumerate who should have been enumerated.

Postenumeration survey (PES). A national survey that is operationally independent of the decennial census, taken shortly after the census has concluded its data collection, for purposes of coverage evaluation based on dual-systems estimation.

Poststratification. The use of covariates to define what are believed to be more homogeneous subgroups of the population for which separate dual-systems estimation computations are conducted.

Proxy enumeration. Information collected from landlords or neighbors in the place of information that was intended to be collected from a household's residents.

P-sample. Generally, the residents of the set of households located in the sample of block clusters selected to be included in the postenumeration survey to support dual-systems estimation.
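The logistic regression entry above can be made concrete with a small sketch of how a fitted model turns covariates into an estimated enumeration probability. The function names, coefficients, and covariates below are invented for illustration; they do not describe any actual Census Bureau model.

```python
import math

def logistic(x):
    """The logistic (sigmoid) function: maps any real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def enumeration_probability(intercept, coefs, covariates):
    """P(enumerated) = logistic(b0 + sum_i b_i * x_i)."""
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return logistic(linear)

# Hypothetical fitted model: intercept 1.5, an owner-occupied
# indicator with coefficient 0.8, and a mail-return indicator
# with coefficient 1.2.
p = enumeration_probability(1.5, [0.8, 1.2], [1, 0])
print(round(p, 3))  # 0.909: logistic(1.5 + 0.8) for this case
```

In practice such models are fit by maximum likelihood over many cases; the sketch only shows how a fitted model is evaluated for one case.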

Statistical Administrative Records System (StARS). A national roster constructed for research purposes by the Census Bureau that links people to current addresses by merging and unduplicating a number of administrative records.

Synthetic estimation. A statistical method for small-area estimation that assumes that net undercount rates for demographic groups in a domain apply without change to all geographic subsets of that domain.

Topologically Integrated Geographic Encoding and Referencing System (TIGER). A geographic information system that links a given address to a physical location defined by city blocks, roads and railroads, natural boundaries, and political boundaries.

Undercount. The measurement of either the number or rate of individuals missed in the census who should have been enumerated. (See also Differential undercount and Net undercount.)
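As the synthetic estimation entry above states, a single group-level net undercount rate is applied without change to every geographic subset of the domain. A minimal sketch of that arithmetic, using an assumed 2 percent rate and invented area counts:

```python
# Synthetic estimation sketch. The 2% net undercount rate and the
# area counts are hypothetical, not estimates from any census.

def synthetic_adjusted_count(census_count, net_undercount_rate):
    """If the rate is (true - counted) / true, then the adjusted
    (estimated true) count is census_count / (1 - rate)."""
    return census_count / (1.0 - net_undercount_rate)

# The same national group rate is applied to each area, regardless
# of any local variation -- the defining assumption of the method.
for area, count in [("Area A", 50_000), ("Area B", 120_000)]:
    print(area, round(synthetic_adjusted_count(count, 0.02)))
```

The method's accuracy therefore depends on how uniform the group's true undercount rate actually is across areas, which is one reason later chapters discuss modeling coverage error components directly.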
