Suggested Citation:"Front Matter." National Academies of Sciences, Engineering, and Medicine. 2017. Federal Statistics, Multiple Data Sources, and Privacy Protection: Next Steps. Washington, DC: The National Academies Press. doi: 10.17226/24893.

FEDERAL STATISTICS, MULTIPLE DATA SOURCES, AND PRIVACY PROTECTION

Next Steps

Panel on Improving Federal Statistics for
Policy and Social Science Research Using
Multiple Data Sources and State-of-the-Art Estimation Methods

Robert M. Groves and Brian A. Harris-Kojetin, Editors

Committee on National Statistics

Division of Behavioral and Social Sciences and Education

A Consensus Study Report of

The National Academies of Sciences, Engineering, and Medicine

THE NATIONAL ACADEMIES PRESS
Washington, DC
www.nap.edu

THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001

This activity was supported by a grant from the Laura and John Arnold Foundation with additional support from the National Academy of Sciences Kellogg Fund. Support for the work of the Committee on National Statistics is provided by a consortium of federal agencies through a grant from the National Science Foundation, a National Agricultural Statistics Service cooperative agreement, and several individual contracts. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.

International Standard Book Number-13: 978-0-309-46537-3
International Standard Book Number-10: 0-309-46537-0
Digital Object Identifier: https://doi.org/10.17226/24893

Additional copies of this report are available for sale from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu/.

Copyright 2017 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

Suggested citation: National Academies of Sciences, Engineering, and Medicine. (2017). Federal Statistics, Multiple Data Sources, and Privacy Protection: Next Steps. Washington, DC: The National Academies Press. doi: https://doi.org/10.17226/24893.

The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.

The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. C. D. Mote, Jr., is president.

The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.

The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.

Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.

Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.

Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.

For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.

PANEL ON IMPROVING FEDERAL STATISTICS FOR POLICY AND SOCIAL SCIENCE RESEARCH USING MULTIPLE DATA SOURCES AND STATE-OF-THE-ART ESTIMATION METHODS

ROBERT M. GROVES (Chair), Office of the Provost, Department of Mathematics and Statistics, and Department of Sociology, Georgetown University

MICHAEL E. CHERNEW, Department of Health Care Policy, Harvard Medical School

PIET DAAS, Department of Corporate Services, Information Technology and Methodology, Statistics Netherlands

CYNTHIA DWORK, John A. Paulson School of Engineering and Applied Sciences, and Radcliffe Institute for Advanced Study, Harvard University

OPHIR FRIEDER, Department of Computer Science, Georgetown University

HOSAGRAHAR V. JAGADISH, Computer Science and Engineering, University of Michigan

FRAUKE KREUTER, Joint Program in Survey Methodology, University of Maryland, and Statistics and Methodology, University of Mannheim and Institute for Employment Research

SHARON LOHR, Westat, Rockville, MD

JAMES P. LYNCH, Department of Criminology and Criminal Justice, University of Maryland

COLM O’MUIRCHEARTAIGH, Harris School of Public Policy Studies, University of Chicago

TRIVELLORE RAGHUNATHAN, Institute for Social Research, University of Michigan

ROBERTO RIGOBON, Sloan School of Management, Massachusetts Institute of Technology

MARC ROTENBERG, Electronic Privacy Information Center, Washington, DC

BRIAN HARRIS-KOJETIN, Study Director

HERMANN HABERMANN, Senior Program Officer

GEORGE SCHOEFFEL, Research Assistant

AGNES GASKIN, Administrative Assistant

COMMITTEE ON NATIONAL STATISTICS

ROBERT M. GROVES (Chair), Office of the Provost, Department of Mathematics and Statistics, and Department of Sociology, Georgetown University

FRANCINE BLAU, School of Industrial and Labor Relations, Cornell University

MARY ELLEN BOCK, Department of Statistics, Purdue University (emerita)

ANNE C. CASE, Woodrow Wilson School of Public and International Affairs, Princeton University

MICHAEL CHERNEW, Department of Health Care Policy, Harvard Medical School

JANET CURRIE, Woodrow Wilson School of Public and International Affairs, Princeton University

DONALD DILLMAN, Social and Economic Sciences Research Center, Washington State University

CONSTANTINE GATSONIS, Center for Statistical Sciences, Brown University

JAMES HOUSE, Survey Research Center, Institute for Social Research, University of Michigan

THOMAS MESENBOURG, Retired, formerly U.S. Census Bureau

SARAH NUSSER, Office of the Vice President for Research and Department of Statistics, Iowa State University

COLM O’MUIRCHEARTAIGH, Harris School of Public Policy Studies, University of Chicago

JEROME P. REITER, Department of Statistical Science, Duke University

ROBERTO RIGOBON, Sloan School of Management, Massachusetts Institute of Technology

JUDITH A. SELTZER, Department of Sociology, University of California, Los Angeles

EDWARD SHORTLIFFE, Department of Biomedical Informatics, Columbia University/Arizona State University

BRIAN A. HARRIS-KOJETIN, Director

CONSTANCE F. CITRO, Senior Scholar

Acknowledgments

This report of the Panel on Improving Federal Statistics for Policy and Social Science Research Using Multiple Data Sources and State-of-the-Art Estimation Methods is the product of contributions from many colleagues, whom we thank for their generous sharing of their time and expertise.

The panel is grateful to the Laura and John Arnold Foundation for funding this study, and to foundation staff Stuart Buck and Meredith McPhail for their help and guidance throughout the study. The panel also is grateful for the supplemental funding provided by the National Academy of Sciences Kellogg Fund.

The panel thanks the many individuals who participated in the panel’s workshops and open meetings and shared their research, their challenges, and their creative approaches to using administrative and private-sector data sources. We also thank Steve Eglash (Stanford University) for his work examining issues of data access for private-sector companies.

At the National Academies of Sciences, Engineering, and Medicine, the panel would not have been able to complete its work efficiently without a capable staff. Constance F. Citro, former director of the Committee on National Statistics (CNSTAT), had the vision and perseverance to make this study a reality. The division’s Kirsten Sampson-Snyder was extremely helpful in coordinating the review process, and Eugenia Grohman provided meticulous and thorough editing that greatly improved the readability of the report. For CNSTAT, Agnes Gaskin, administrative assistant, provided assistance in managing the logistics of this panel and our meetings. Hermann Habermann, senior program officer, provided valuable feedback and guidance on drafts of this report. George Schoeffel, research assistant, assisted with every aspect of the study, including creating and managing a database of references, creating figures and tables, researching and drafting items for the report, carefully reviewing drafts, and performing whatever tasks needed to be done for the panel and the report.

This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.

We thank the following individuals for their review of this report: Cynthia Z.F. Clark, independent consultant, McLean, VA; Mick P. Couper, Institute for Social Research, University of Michigan; Jeremy Freese, Department of Sociology, Stanford University; Pamela Herd, Robert M. La Follette School of Public Affairs, University of Wisconsin–Madison; Thomas L. Mesenbourg, U.S. Census Bureau (retired); Stephen W. Raudenbush, Department of Sociology, University of Chicago; Jerome P. Reiter, Department of Statistical Science, Duke University; and Larry A. Wasserman, Department of Statistics and Machine Learning Department, Carnegie Mellon University.

Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the report’s conclusions or recommendations, nor did they see the final draft of the report before its release. The review of this report was overseen by Michael Hout, Department of Sociology, New York University, and Alicia L. Carriquiry, Department of Statistics, Iowa State University. They were responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring panel and the National Academies.

Robert M. Groves, Chair
Panel on Improving Federal Statistics for
Policy and Social Science Research Using Multiple Data Sources and
State-of-the-Art Estimation Methods
and Brian A. Harris-Kojetin, Study Director

Preface

This is the second Consensus Study Report of the Panel on Improving Federal Statistics for Policy and Social Science Research Using Multiple Data Sources and State-of-the-Art Estimation Methods. Our first report, Innovations in Federal Statistics: Combining Data Sources While Protecting Privacy, was released in January 2017. In that report, the panel noted that there has been increasing attention in recent years to using data already collected by government entities for statistical purposes, such as evaluation of government programs. These data include such records as employment and earnings information on state unemployment insurance, income reported on federal tax forms, Social Security earnings and benefits, medical conditions and payments made for services from Medicare and Medicaid records, and food assistance program benefits.

We also noted that after the panel had begun its work, Congress had established an Evidence-Based Policymaking Commission (P.L. 114-140) and charged it with examining arrangements for integrating federal survey and administrative data and making those data available to researchers for program evaluation. The commission issued its final report on September 7, 2017, after the panel had completed its deliberations.

The commission’s focus was somewhat different from that of the panel. It addressed using statistical analysis to evaluate government programs and alternative policy options. The panel was more specifically focused on improvement in federal statistics through the use of multiple data sources. However, there was clearly overlap in the two activities.

Since the panel had completed its work when the commission’s report was released, we could not consider the similarities and differences between the commission’s recommendations and our own, so we leave that to the readers of the two reports. It is our hope that this report is useful to federal agencies and their stakeholders, as well as to the broader research community. It attempts to identify key challenges to sample surveys, which have long been the mainstay of federal statistics, and offer approaches to using the wealth of administrative and private-sector data that exist and that are being created every day.

Robert M. Groves, Chair
Panel on Improving Federal Statistics for
Policy and Social Science Research Using Multiple Data Sources and
State-of-the-Art Estimation Methods

The environment for obtaining information and providing statistical data for policy makers and the public has changed significantly in the past decade, raising questions about the fundamental survey paradigm that underlies federal statistics. New data sources provide opportunities to develop a new paradigm that can improve timeliness, geographic or subpopulation detail, and statistical efficiency. Such a paradigm also has the potential to reduce the costs of producing federal statistics.

The panel's first report described federal statistical agencies' current paradigm, which relies heavily on sample surveys for producing national statistics, and the challenges agencies are facing; the legal frameworks and mechanisms for protecting the privacy and confidentiality of statistical data and for providing researchers access to data, and challenges to those frameworks and mechanisms; and statistical agencies' access to alternative sources of data. The panel recommended a new approach for federal statistical programs that would combine diverse data from government and private-sector sources, along with the creation of a new entity that would provide the foundational elements needed for this new approach, including legal authority to access data and protect privacy.

This second of the panel's two reports builds on the analysis, conclusions, and recommendations in the first one. It assesses alternative methods for implementing a new approach that would combine diverse data from government and private-sector sources, including describing statistical models for combining data from multiple sources; examining statistical and computer science approaches that foster privacy protections; evaluating frameworks for assessing the quality and utility of alternative data sources; and considering various models for implementing the recommended new entity. Together, the two reports offer ideas and recommendations to help federal statistical agencies examine and evaluate data from alternative sources and then combine them as appropriate to provide the country with more timely, actionable, and useful information for policy makers, businesses, and individuals.
