PUTTING PEOPLE ON THE MAP
PROTECTING CONFIDENTIALITY WITH LINKED SOCIAL-SPATIAL DATA
Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data
Myron P. Gutmann and Paul C. Stern, Editors
Committee on the Human Dimensions of Global Change
Division of Behavioral and Social Sciences and Education
THE NATIONAL ACADEMIES PRESS
Washington, D.C.
www.nap.edu
THE NATIONAL ACADEMIES PRESS • 500 FIFTH STREET, N.W. • Washington, DC 20001
NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.
This study was supported by Contract/Grant Nos. BCS-0431863, NNH04PR35P, and N01-OD-4-2139, TO 131 between the National Academy of Sciences and the U.S. National Science Foundation, the U.S. National Aeronautics and Space Administration, and the U.S. Department of Health and Human Services, respectively. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the organizations or agencies that provided support for the project.
Library of Congress Cataloging-in-Publication Data
Putting people on the map : protecting confidentiality with linked social-spatial data / Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data, Committee on the Human Dimensions of Global Change, Division of Behavioral and Social Sciences and Education. p. cm. "National Research Council." Includes bibliographical references. ISBN 978-0-309-10414-2 (pbk.) -- ISBN 978-0-309-66831-6 (pdf) 1. Social sciences--Research--Moral and ethical aspects. 2. Confidential communications--Social surveys. 3. Spatial analysis (Statistics) 4. Privacy, Right of--United States. 5. Public records--Access control--United States. I. National Research Council (U.S.). Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data. II. Title: Protecting confidentiality with linked social-spatial data.
H62.P953 2007 174'.93--dc22 2006103005
Additional copies of this report are available from the National Academies Press, 500 Fifth Street, N.W., Lockbox 285, Washington, DC 20055; (800) 624-6242 or (202) 334-3313 (in the Washington metropolitan area); Internet http://www.nap.edu.
Printed in the United States of America.
Cover image: Tallinn, the capital city and main seaport of Estonia, is located on Estonia's north coast to the Gulf of Finland. Acquired on June 18, 2006, this scene covers an area of 35.6 × 37.5 km and is located at 59.5 degrees north latitude and 25 degrees east longitude. The red dots are arbitrarily selected and do not correspond to the locations of actual research participants. Cover credit: NASA/GSFC/METI/ERSDAC/JAROS and U.S./Japan ASTER Science Team.
Suggested citation: National Research Council. (2007). Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data. Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data. M.P. Gutmann and P.C. Stern, Eds. Committee on the Human Dimensions of Global Change. Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. Wm. A. Wulf is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Harvey V. Fineberg is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government.
Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. Wm. A. Wulf are chair and vice chair, respectively, of the National Research Council.
www.national-academies.org
PANEL ON CONFIDENTIALITY ISSUES ARISING FROM THE INTEGRATION OF REMOTELY SENSED AND SELF-IDENTIFYING DATA
MYRON P. GUTMANN, Chair, Inter-university Consortium for Political and Social Research, University of Michigan, Ann Arbor
MARC P. ARMSTRONG, Department of Geography, University of Iowa
DEBORAH BALK, School of Public Affairs, Baruch College, City University of New York
KATHLEEN O'NEILL GREEN, Alta Vista Company, Berkeley, CA
FELICE J. LEVINE, American Educational Research Association, Washington, DC
HARLAN J. ONSRUD, Department of Spatial Information Science and Engineering, University of Maine
JEROME P. REITER, Institute of Statistics and Decision Science, Duke University
RONALD R. RINDFUSS, Department of Sociology and the Carolina Population Center, University of North Carolina at Chapel Hill
PAUL C. STERN, Study Director
LINDA DEPUGH, Administrative Assistant
Preface
The main themes of this report, protecting the confidentiality of human research subjects in social science research and simultaneously ensuring that research data are used as widely and as frequently as possible, have been the subject of a number of National Research Council (NRC) publications over a considerable span of time. Beginning with Sharing Research Data (1985) and continuing with Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics (1993), Protecting Participants and Facilitating Behavioral and Social Science Research (2003), and, most recently, Expanding Access to Research Data: Reconciling Risks and Opportunities (2005), a series of reports has emphasized the value of expanded sharing and use of social science data while simultaneously protecting the interests (and especially the confidentiality) of human research subjects. This report draws from those earlier evaluations and analyzes the role played by a type of data infrequently discussed in those publications: data that explicitly identify a location associated with a research subject, such as home, work, school, doctor's office, or somewhere else.
The increased availability of spatial information, the increasing knowledge of how to perform sophisticated scientific analyses using it, and the growth of a body of science that makes use of these data and analyses to study important social, economic, environmental, spatial, and public health problems have led to an increase in the collection and preservation of these data and in the linkage of spatial and nonspatial information about the same research subjects. At the same time, questions have been raised about the best ways to increase the use of such data while preserving respondent
confidentiality. The latter is important because analyses that make the most productive use of spatial information often require great accuracy and precision in that information: for example, if you want to know the route someone takes from home to the doctor's office, imprecision in one or the other degrades the analysis. Yet precise information about spatial location is almost perfectly identifying: if one knows where someone lives, one is likely to know the person's identity. That tension between the need for precision and the need to protect the confidentiality of research subjects is what motivates this study.
In this report, the Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data recommends ways to find a successful balance between needs for precision and the protection of confidentiality. It considers both institutional and technical solutions and draws conclusions about each. In general, we find that institutional solutions are the most promising for the short term, though they need further development, while technical solutions have promise in the longer term and require further research.
As the report explains, the members of the panel chose in one significant way to broaden their mandate beyond the explicit target of "remotely sensed and self-identifying" data because working within the limitation of remotely sensed data restricted the problem domain in a way at odds with the world. From the perspective of confidentiality protection, when social science research data are linked with spatial information, it does not matter whether the geospatial locations are derived from remotely sensed imagery or from other means of determining location (GPS devices, for example). The issues raised by linking remotely sensed information are a special case within the larger category of spatially precise and accurate information.
For that reason, the study considers all forms of spatial information as part of its mandate.
In framing the response to its charge, the panel drew heavily on existing reports, on published material, and on best practices in the field. The panel also commissioned papers and reports from experts; they were presented at a workshop held in December 2005 at the National Academies. Two of the papers are included as appendixes to this report. Biographical sketches of panel members and staff are also included at the end of this report.
This report could not have been completed successfully without the hard work of members of the NRC staff. Paul Stern served as study director for the panel and brought his usual skills in planning, organization, consensus building, and writing. Moreover, from a panel chair's perspective, he is a superb partner and collaborator. We also thank the members of the Committee on the Human Dimensions of Global Change, under whose auspices the panel was constituted, for their support.
The panel members and I also thank the participants in the Workshop
on Confidentiality Issues in Linking Geographically Explicit and Self-Identifying Data. Their papers and presentations provided the members of the panel with a valuable body of information and interpretations, which contributed substantially to our formulation of both problems and solutions.
Rebecca Clark of the Demographic and Behavioral Sciences Branch of the National Institute of Child Health and Human Development has been a tireless supporter of many of the intellectual issues addressed by this study, both those that encourage the sharing of data and those that encourage the protection of confidentiality; and it was in good part her energy that led to the study's initiation. We gratefully acknowledge her efforts and the financial support of the National Institute of Child Health and Human Development, a part of the National Institutes of Health of the Department of Health and Human Services; the National Science Foundation; and the National Aeronautics and Space Administration.
Finally, I thank the members of the panel for their hard work and active engagement in the process of preparing this report. They are a lively group with a wide diversity of backgrounds and approaches to the use of spatial and social science data, who all brought a genuine concern for enhancing research, sharing data, and protecting confidentiality to the task that confronted us. National Research Council panels are expected to be interdisciplinary: that's the goal of constituting them to prepare reports such as this one. This particular panel was made up of individuals who were themselves interdisciplinary, and the breadth of their individual and group expertise made the process of completing the report especially rewarding. The panel's discussions aimed to find balance and consensus among these diverse individuals and their diverse perspectives. Writing the report was a group effort to which everyone contributed. I'm grateful for the hard work.
This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the Report Review Committee of the National Research Council. The purpose of this independent review is to provide candid and critical comments that assist the institution in making the published report as sound as possible and ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their participation in the review of the report: Joe S. Cecil, Division of Research, Federal Judicial Center, Washington, DC; Lawrence H. Cox, Research and Methodology, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD; Glenn D. Deane, Department of Sociology, University at Albany; Jerome E. Dobson, Department of Geography, University of Kansas; George T. Duncan, Heinz School of Public Policy and Management, Carnegie Mellon University; Lawrence Gostin, Research and Academic Programs, Georgetown University Law Center, Washington, DC; Joseph C. Kvedar, Director's Office, Partners Telemedicine, Boston, MA; W. Christopher Lenhardt, Socioeconomic Data and Applications Center, Columbia University, Palisades, NY; Jean-Bernard Minster, Scripps Institution of Oceanography, University of California, La Jolla, CA; and Gerard Rushton, Department of Geography, The University of Iowa.
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations nor did they see the final draft of the report before its release. The review of this report was overseen by Richard Kulka, Abt Associates, Durham, NC. Appointed by the National Research Council, he was responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring panel and the institutions.
Myron P. Gutmann, Chair
Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data
Contents
Executive Summary
1 Linked Social-Spatial Data: Promises and Challenges
2 Legal, Ethical, and Statistical Issues in Protecting Confidentiality
3 Meeting the Challenges
4 The Tradeoff: Confidentiality Versus Access
References
Appendixes
A Privacy for Research Data, Robert Gellman
B Ethical Issues Related to Linked Social-Spatial Data, Felice J. Levine and Joan E. Sieber
Biographical Sketches for Panel Members and Staff