PUTTING PEOPLE ON THE MAP

Executive Summary

Precise, accurate spatial data are contributing to a revolution in some fields of social science. Improved access to such data about individuals, groups, and organizations makes it possible for researchers to examine questions they could not otherwise explore, gain better understanding of human behavior in its physical and environmental contexts, and create benefits for society from the knowledge that flows from new types of scientific research. However, to the extent that data are spatially precise, there is a corresponding increase in the risk of identification of the people or organizations to which the data apply. With identification comes a risk of various kinds of harm to those identified and the compromise of promises of confidentiality made to gain access to the data.

This report focuses on the opportunities and challenges that arise when accurate and precise spatial data on research participants, such as the locations of their homes or workplaces, are linked to personal information they have provided under promises of confidentiality. The availability of these data makes it possible to do valuable new kinds of research that links information about the external environment to the behavior and values of individuals. Among many possible examples, such research can explore how decisions about health care are made, how young people develop healthy lifestyles, and how resource-dependent families in poorer countries spend their time obtaining the energy and food that they need to survive. The linkage of spatial and social information, like the growing linkage of socioeconomic characteristics with biomarkers (biological data on individuals), has the potential to revolutionize social science and to significantly advance policy making.

While the availability of linked social-spatial data has great promise for research, the locational information makes it possible for a secondary user of the linked data to identify the participant and thus break the promise of confidentiality made when the social data were collected. Such a user could also discover additional information about the research participant, without asking for it, by linking to geographically coded information from other sources.

Open public access to linked social and high-resolution spatial data greatly increases the risk of breaches of confidentiality. At the same time, highly restrictive forms of data management and dissemination carry very high costs: by making it prohibitively difficult for researchers to gain access to data or by restricting or altering the data so much that they are no longer useful for answering many types of important scientific questions.

CONCLUSIONS

CONCLUSION 1: Recent advances in the availability of social-spatial data and the development of geographic information systems (GIS) and related techniques to manage and analyze those data give researchers important new ways to study important social, environmental, economic, and health policy issues and are worth further development.

CONCLUSION 2: The increasing use of linked social-spatial data has created significant uncertainties about the ability to protect the confidentiality promised to research participants. Knowledge is as yet inadequate concerning the conditions under which and the extent to which the availability of spatially explicit data about participants increases the risk of confidentiality breaches.
Various new technical procedures involving transforming data or creating synthetic datasets show promise for limiting the risk of identification while providing broader access and maintaining most of the scientific value of the data. However, these procedures have not been sufficiently studied to realistically determine their usefulness.

CONCLUSION 3: Recent research on technical approaches for reducing the risk of identification and breach of confidentiality has demonstrated promise for future success. At this time, however, no known technical strategy or combination of technical strategies for managing linked spatial-social data adequately resolves conflicts among the objectives of data linkage, open access, data quality, and confidentiality protection across datasets and data uses.
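As one illustration of what "transforming data" can mean in practice, the sketch below coarsens point locations to grid cells and suppresses cells that contain too few records. It is a minimal example under stated assumptions, not a procedure taken from this report: the function names, the cell size `cell_deg`, and the threshold `min_count` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def snap_to_grid(lat, lon, cell_deg):
    """Return the center of the grid cell that contains (lat, lon)."""
    row = math.floor(lat / cell_deg)
    col = math.floor(lon / cell_deg)
    return ((row + 0.5) * cell_deg, (col + 0.5) * cell_deg)


def mask_locations(records, cell_deg=0.1, min_count=3):
    """Coarsen locations to grid cells and suppress sparsely populated cells.

    records: iterable of (record_id, lat, lon) tuples.
    Returns a dict mapping record_id to the cell center, or to None when
    the cell holds fewer than min_count records (location suppressed).
    cell_deg and min_count are illustrative assumptions, not recommended values.
    """
    cells = defaultdict(list)
    for record_id, lat, lon in records:
        cells[snap_to_grid(lat, lon, cell_deg)].append(record_id)

    masked = {}
    for center, ids in cells.items():
        keep = len(ids) >= min_count
        for record_id in ids:
            masked[record_id] = center if keep else None
    return masked
```

Larger cells and higher thresholds lower the chance that a location singles out a participant, but they also discard spatial detail that analysts may need. That trade-off is the conflict between data quality and confidentiality protection described in Conclusion 3.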
CONCLUSION 4: Because technical strategies will not be sufficient in the foreseeable future for resolving the conflicting demands for data access, data quality, and confidentiality, institutional approaches will be required to balance those demands.

Institutional solutions involve establishing tiers of risk and access and developing data-sharing protocols that match the level of access to the risks and benefits of the planned research. Such protocols will require that the authority to decide about data access be allocated appropriately among primary researchers, data stewards, data users, institutional review boards (IRBs), and research sponsors and that those actors are very well informed about the benefits and risks of the data access policies they may be asked to approve.

We generally endorse the recommendations of the 2004 National Research Council report, Protecting Participants and Facilitating Social and Behavioral Sciences Research, and the 2005 report, Expanding Access to Research Data: Reconciling Risks and Opportunities, regarding restricted access to confidential data and unrestricted access to public-use data that have been modified so as to protect confidentiality, expanded data access (remotely and through licensing agreements), increased research on ways to address the competing claims of access and confidentiality, and related matters. Those reports, however, have not dealt in detail with the risks and tradeoffs that arise with data that link the information in social science research with spatial locations. Consequently, we offer eight recommendations to address those data.
RECOMMENDATIONS

Recommendation 1: Technical and Institutional Research

Federal agencies and other organizations that sponsor the collection and analysis of linked social-spatial data (or that support data that could provide added benefits with such linkage) should sponsor research into techniques and procedures for disseminating such data while protecting confidentiality and maintaining the usefulness of the data for social-spatial analysis. This research should include studies to adapt existing techniques from other fields, to understand how the publication of linked social-spatial data might increase disclosure risk, and to explore institutional mechanisms for disseminating linked data while protecting confidentiality and maintaining the usefulness of the data.
Recommendation 2: Education and Training

Faculty, researchers, and organizations involved in the continuing professional development of researchers should engage in the education of researchers in the ethical use of spatial data. Professional associations should participate by establishing and inculcating strong norms for the ethical use and sharing of linked social-spatial data.

Recommendation 3: Training in Ethical Issues

Training in ethical considerations needs to accompany all methodological training in the acquisition and use of data that include geographically explicit information on research participants.

Recommendation 4: Outreach by Professional Societies and Other Organizations

Research societies and other research organizations that use linked social-spatial data and that have established traditions of protection of the confidentiality of human research participants should engage in outreach to other research societies and organizations less conversant in research with issues of human participant protection to increase attention to these issues in the context of the use of personal, identifiable data.

Recommendation 5: Research Design

Primary researchers who intend to collect and use spatially explicit data should design their studies in ways that not only take into account the obligation to share data and the disclosure risks posed, but also provide confidentiality protection for human participants in the primary research as well as in secondary research use of the data. Although the reconciliation of these objectives is difficult, primary researchers should nevertheless assume a significant part of this burden.
Recommendation 6: Institutional Review Boards

Institutional Review Boards and their organizational sponsors should develop the expertise needed to make well-informed decisions that balance the objectives of data access, confidentiality, and quality in research projects that will collect or analyze linked social-spatial data.
Recommendation 7: Data Enclaves

Data enclaves deserve further development as a way to provide wider access to high-quality data while preserving confidentiality. This development should focus on the establishment of expanded place-based enclaves, "virtual enclaves," and meaningful penalties for misuse of enclaved data.

Recommendation 8: Licensing

Data stewards should develop licensing agreements to provide increased access to linked social-spatial datasets that include confidential information.

The promise of gaining important scientific knowledge through the availability of linked social-spatial data can only be fulfilled with careful attention by primary researchers, data stewards, data users, IRBs, and research sponsors to balancing the needs for data access, data quality, and confidentiality. Until technical solutions are available, that balancing must come through institutional mechanisms.