NATIONAL ACADEMY PRESS
2101 Constitution Avenue, N.W. Washington, D.C. 20418
NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.
This study was supported by Contract/Grant No. SBR-9709489 between the National Academy of Sciences and the National Science Foundation. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.
International Standard Book Number 0-309-07180-1
Additional copies of this report are available from National Academy Press, 2101 Constitution Avenue, N.W., Lockbox 285, Washington, D.C. 20055; (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area); Internet, http://www.nap.edu
Printed in the United States of America.
Copyright 2000 by the National Academy of Sciences. All rights reserved.
Suggested citation: National Research Council (2000). Improving Access to and Confidentiality of Research Data: Report of a Workshop. Committee on National Statistics, Christopher Mackie and Norman Bradburn, Eds. Commission on Behavioral and Social Sciences and Education. Washington, D.C.: National Academy Press.
THE NATIONAL ACADEMIES
National Academy of Sciences
National Academy of Engineering
Institute of Medicine
National Research Council
The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce M. Alberts is president of the National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. William A. Wulf is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce M. Alberts and Dr. William A. Wulf are chairman and vice chairman, respectively, of the National Research Council.
COMMITTEE ON NATIONAL STATISTICS 1999-2000
JOHN E. ROLPH
Marshall School of Business, University of Southern California
JOSEPH G. ALTONJI,
Department of Economics, Northwestern University
LAWRENCE D. BROWN,
Department of Statistics, University of Pennsylvania
JULIE DAVANZO, RAND,
Santa Monica, California
WILLIAM F. EDDY,
Department of Statistics, Carnegie Mellon University
Statistics Division, United Nations, New York
WILLIAM D. KALSBEEK,
Survey Research Unit, Department of Biostatistics, University of North Carolina
RODERICK J.A. LITTLE,
School of Public Health, University of Michigan
THOMAS A. LOUIS,
Division of Biostatistics, University of Minnesota
CHARLES F. MANSKI,
Department of Economics, Northwestern University
EDWARD B. PERRIN,
Department of Health Services, University of Washington
FRANCISCO J. SAMANIEGO,
Division of Statistics, University of California, Davis
RICHARD L. SCHMALENSEE,
Sloan School of Management, Massachusetts Institute of Technology
MATTHEW D. SHAPIRO,
Department of Economics, University of Michigan
ANDREW A. WHITE,
The Committee on National Statistics (CNSTAT) appreciates the time, effort, and valuable input of the many people who contributed to the workshop on confidentiality of and access to research data and to the preparation of this report. We would first like to thank those who made presentations, which, along with the background papers prepared for the workshop, helped identify many of the key issues in this area. The comments made by attendees contributed to a broad-ranging exchange of ideas that is captured in this summary report. We are also thankful for the additional input provided by participants on early report drafts. Thanks are due especially to Norman Bradburn, former CNSTAT member, who as workshop chair provided valuable advice during the planning stages and the leadership necessary for conducting a successful workshop. The agenda for the workshop was developed in consultation with Richard Suzman, director of behavioral and social research at the National Institute on Aging, whose input was essential in identifying workshop objectives.
Particular appreciation is due to those who worked to organize the workshop and prepare this report. Christopher Mackie served as study director for the workshop. He led the planning of the workshop, worked to ensure its successful conduct, prepared the report drafts, and revised the report in response to comments from reviewers and workshop participants. Tom Jabine, who served as consultant to the committee, offered valuable guidance on the agenda, the selection of appropriate presenters, and the preparation of this report. He and Heather Koball, research associate in CNSTAT, contributed significantly to our work with a paper on the practices of organizations that distribute public-use data, which was presented at the workshop and is summarized in this report. The extra time, guidance, and input of the CNSTAT subcommittee for the workshop (Thomas Louis, Roderick Little, and Charles Manski) further enhanced the workshop's outcome and this report. Miron Straf, former CNSTAT director, was responsible for early project development and workshop planning. CNSTAT staff members Terri Scanlan and Jamie Casey were responsible for all of the details involved in organizing the workshop and preparing this report. Rona Briere edited the final draft. Eugenia Grohman, associate director for reports in the Commission on Behavioral and Social Sciences and Education, guided the report through the review process, final editing, and publication.
This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the Report Review Committee of the National Research Council. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making the published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their participation in the review of this report: William F. Eddy, Department of Statistics, Carnegie Mellon University; Amitai Etzioni, Department of Sociology, George Washington University; Jack Feldman, Center for Health Affairs/Project HOPE, Bethesda, MD; Olivia Mitchell, Wharton School, University of Pennsylvania; and Richard Rockwell, Inter-university Consortium for Political and Social Research, University of Michigan.
Although the individuals listed above provided constructive comments and suggestions, it must be emphasized that responsibility for the final content of this report rests entirely with the authoring committee and the institution.
Norman Bradburn, Workshop Chair
The workshop summarized in this report was convened by the Committee on National Statistics (CNSTAT) to promote discussion about methods for advancing the often conflicting goals of exploiting the research potential of microdata and maintaining acceptable levels of confidentiality. The primary sponsor of the workshop was the National Institute on Aging (NIA), but additional funding was received from the Agency for Health Care Policy and Research; the Bureau of Labor Statistics; the National Library of Medicine; the Office of Research and Statistics, Social Security Administration; and the Office of the Assistant Secretary for Planning and Evaluation, U.S. Department of Health and Human Services. Sponsors voiced a common desire to develop research programs aimed at quantitatively assessing the risks of reidentification in surveys linked to administrative data.
Sponsors also stressed the importance of demonstrating and weighing the value of linked data to research and policy. Prior to the CNSTAT workshop, NIA funded a preworkshop conference, organized through the University of Michigan, to illustrate this value—particularly as it applies to research on aging issues. The workshop was designed to advance the dialogue necessary for federal agencies to make sound decisions about how and to whom to release data, and in what cases to allow linkage to administrative records. Sponsors were interested in improving communication among communities with divergent interests, as well as the decision-making frameworks for guiding data release procedures.
This report outlines essential themes of the access versus confidentiality debate that emerged during the workshop. Among these themes are the tradeoffs and tensions between the needs of researchers and other data users on the one hand and confidentiality requirements on the other; the relative advantages and costs of data perturbation techniques (applied to facilitate public release) versus restricted access as tools for improving security; and the need to quantify disclosure risks, both absolute and relative, created by researchers and research data, as well as by other data users and other types of data.
The workshop was not designed to produce policy recommendations. However, this report does summarize areas of discussion in which common ground among some participants emerged. For example, a subset of participants endorsed the idea that both access and confidentiality can benefit from (1) more coordination among agencies regarding data release procedures and creation of data access outlets, (2) increased communication between data producers and users, (3) improved quantification of the disclosure risks and research benefits associated with different types of data release, and (4) stricter enforcement of laws designed to ensure proper use of restricted access data.
Finally, the report anticipates the direction of future CNSTAT projects. Future work will likely address evolving statistical techniques for manipulating data in ways that preserve important statistical properties and allow for broader general data release; new, less burdensome ways of providing researchers with access to restricted data sets; and the role of licensing coupled with graduated civil and criminal penalties for infringement.
Norman Bradburn, Workshop Chair