Committee on Toward an Open Science Enterprise Board on Research Data and Information Policy and Global Affairs A Consensus Study Report of
THE NATIONAL ACADEMIES PRESS 500 Fifth Street, NW Washington, DC 20001
This activity was supported by the Laura and John Arnold Foundation. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.
International Standard Book Number-13: 978-0-309-47624-9
International Standard Book Number-10: 0-309-47624-0
Library of Congress Control Number: 2018950760
Digital Object Identifier: https://doi.org/10.17226/25116
Additional copies of this publication are available for sale from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.
Copyright 2018 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2018. Open Science by Design: Realizing a Vision for 21st Century Research. Washington, DC: The National Academies Press. doi: https://doi.org/10.17226/25116.
The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.
The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. C. D. Mote, Jr., is president.
The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.
The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.
Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.
Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.
For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.
COMMITTEE ON TOWARD AN OPEN SCIENCE ENTERPRISE
Alexa T. McCray (NAM) (Chair), Professor of Medicine, Harvard Medical School
Francine Berman, Edward P. Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic Institute
Michael Carroll, Professor of Law, American University Washington College of Law
Donna Ginther, Professor, Department of Economics; Director, Center for Science, Technology and Economic Policy, University of Kansas
Robert Miller, Chief Executive Officer, LYRASIS
Peter Schiffer, Vice Provost for Research and Professor in Applied Physics, Yale University
Edward Seidel, Vice President for Economic Development and Innovation, University of Illinois System; Founder Professor of Physics, Professor of Astronomy and Computer Science, University of Illinois at Urbana-Champaign
Alex Szalay, Bloomberg Distinguished Professor of Astronomy, The Johns Hopkins University
Lisa Tauxe (NAS), Distinguished Professor of Geophysics, Scripps Institution of Oceanography, University of California, San Diego
Heng Xu, Associate Professor of Information Sciences and Technology, College of Information Sciences and Technology, The Pennsylvania State University
Principal Project Staff
Tom Arrison, Program Director, Policy and Global Affairs Division (from November 2017)
Emi Kameyama, Associate Program Officer, Board on Research Data and Information
George Strawn, Director, Board on Research Data and Information
Ester Sztein, Deputy Director, Board on Research Data and Information
Nicole Lehmer, Senior Program Assistant, Board on Research Data and Information
Alan Anderson, Consultant
Christine Liu, Senior Program Officer (until October 2017)
BOARD ON RESEARCH DATA AND INFORMATION
Alexa McCray (NAM) (Chair), Professor of Medicine, Harvard Medical School
Amy Brand, Director, MIT Press
Kelvin Droegemeier, Vice President for Research, University of Oklahoma
Stuart Feldman, Chief Scientist, Schmidt Futures
Salman Habib, Senior Physicist and Computational Scientist, Argonne National Laboratory
James Hendler, Director, Institute for Data Exploration and Applications, Rensselaer Polytechnic Institute
Elliot Maxwell, Chief Executive Officer, e-Maxwell & Associates
Barend Mons, Chair in Biosemantics, Leiden University Medical Center
Sarah Nusser, Vice President for Research, Iowa State University
Michael Stebbins, President, Science Advisors, LLC
Bonnie Carroll,* Chairman and Chief Executive Officer, Information International Associates (CODATA Secretary General)
John Hildebrand (NAS),* Regents Professor of Neuroscience, University of Arizona (NAS Foreign Secretary)
Paul Uhlir,* Consultant, Data Policy and Management (CODATA Executive Committee Member)
Staff
George Strawn, Director
Tom Arrison, Program Director
Ester Sztein, Deputy Director
Emi Kameyama, Associate Program Officer
Nicole Lehmer, Senior Program Assistant
*Denotes ex-officio member.
Preface
Just as society has been transformed by the digital revolution, so, too, have many aspects of the scientific enterprise. Publicly available data in federally sponsored databases serve as starting points for many research investigations. Collaborations are no longer hampered by geographic distance, and, in some cases, the majority or even all of the work is conducted by sharing digital research files, corresponding by email, and meeting virtually, with time zone differences being the only deterrent to the frequency of the meetings. Data are largely collected, stored, manipulated, and shared in electronic form. Research papers are prepared using word-processing software and are often formatted and submitted in camera-ready form to the publisher. The majority of published articles are no longer bound in print journals and disseminated by conventional postal delivery, but rather are available through the publisher’s database, most often mediated by contracts with institutional libraries.
This transformation has had economic, policy, and practical implications, many of which are still in the process of being fully addressed and resolved. An increasing number of scientists have begun to question the closed world of scientific publishing and have suggested that the results of their research should be openly available for all, to benefit not only fellow scientists, but also the general citizenry. Indeed, the pursuit of “citizen science” is now recognized as a valid and useful activity. Faculty at many universities have adopted university-wide “open access” policies that ensure that, at a minimum, their research papers are available through their institution’s repository. New publishing venues have arisen, including open access journals, some of very high quality and others not.
Individual researchers, while interested in having their work broadly read and cited, are faced with competing pressures, including publishing in journals with high “impact factors,” such that they are in the best possible position for promotion and tenure. Research funders have seen the value of openly sharing the results of the research that they have supported, not just in the form of publications, but also in the form of the data that have been produced in the course of the investigation. They have begun to require that applicants prepare data management plans as part of their grant proposals.
A number of legal and policy developments have facilitated broader access to scientific research. Recognizing the potential of the Internet to broadly and equitably disseminate scientific knowledge, a collaborative effort has created a legal framework that is consistent with U.S. copyright law, and that provides guidance to researchers who would like to have greater control over how their research results are used and disseminated. Several federal policies require that publicly funded research results, in the form of data and publications, be deposited in public access repositories. Legislation is now also pending in Congress that would strengthen these policies.
To evaluate more fully the benefits and challenges of broadening access to the results of scientific research, described as “open science,” the National Academies of Sciences, Engineering, and Medicine appointed an expert committee in March 2017. Brief biographies of the individual committee members are provided in Appendix A. The committee was charged with focusing on how to move toward open science as the default for scientific research results, and to indicate both the benefits of moving toward open science and the barriers to doing so. This report presents the findings and recommendations of the committee, with the majority of the focus on solutions that move the research enterprise toward open science.
This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their review of this report: Prudence Adler, Association of Research Libraries; David Allison, Indiana University, Bloomington; Geoffrey Boulton, University of Edinburgh; Anita de Waard, Elsevier; Michael Forster, Institute of Electrical and Electronics Engineers; Laura Greene, Florida State University; Heather Joseph, Scholarly Publishing and Academic Resources Coalition; Véronique Kiermer, Public Library of Science; Michael Lesk, Rutgers University; William Mobley, University of California, San Diego; Mark Musen, Stanford University; Sarah Nusser, Iowa State University; and George Schatz, Journal of Physical Chemistry.
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations of this report, nor did they see the final draft before its release. The review of this report was overseen by Carl Lineberger, University of Colorado, Boulder, and Julia Phillips, Sandia National Laboratories (Retired). They were responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring committee and the National Academies.
The report would not have been possible without the sponsor of this study, the Laura and John Arnold Foundation, whom we thank for their support. The committee gratefully acknowledges all of the speakers for their informative presentations at our meeting and public symposium. They are listed in Appendix E at the conclusion of the report. The information provided during the meeting and symposium is used throughout this report and provided important perspectives that informed this report’s findings and conclusions.
The committee is also grateful for the assistance of the National Academies staff in preparing this report. Staff members who contributed to this effort are Tom Arrison, program director, Policy and Global Affairs; Emi Kameyama, associate program officer, Board on Research Data and Information; George Strawn, director, Board on Research Data and Information; Ester Sztein, deputy director, Board on Research Data and Information; Nicole Lehmer, senior program assistant, Board on Research Data and Information; Alan Anderson, consultant; Christine Liu, senior program officer (through October 2017); Adriana Courembis, financial officer; Marilyn Baker, director for reports and communication; and John Boright, interim executive director, Policy and Global Affairs.
Finally, I thank especially the members of the committee for their tireless efforts throughout the development of this report.
Alexa T. McCray, Chair
Committee on Toward an Open Science Enterprise
Contents
ABBREVIATIONS AND ACRONYMS
SUMMARY
1 INTRODUCTION
  Context for the Study
  Study Process
  Structure of the Report
2 BROADENING ACCESS TO THE RESULTS OF SCIENTIFIC RESEARCH
  Summary Points
  Origins and Significance of Open Science
  Motivations for Open Science
  Barriers to Open Science
3 THE STATE OF OPEN SCIENCE
  Summary Points
  General State of Open Science
  Current Approaches to Open Science
4 A VISION FOR OPEN SCIENCE BY DESIGN
  Summary Points
  Principles of Open Science by Design
  Practicing Open Science by Design
  Enabling Technologies for Open Science by Design
  Strengthening Training for Open Science by Design
  Other Considerations
5 TRANSITIONING TO OPEN SCIENCE BY DESIGN
  Summary Points
  Barriers and Limitations
  Legal Framework
  Research Funder Policies
  Strategies for Achieving Open Science by Design
6 ACCELERATING PROGRESS TO OPEN SCIENCE BY DESIGN
  Recent Developments
  Findings, Recommendations, and Implementation Actions
REFERENCES
APPENDIXES
A COMMITTEE MEMBER BIOGRAPHIES
B GLOSSARY
C OFFICE OF SCIENCE AND TECHNOLOGY POLICY (OSTP) 2013 MEMORANDUM: INCREASING ACCESS TO THE RESULTS OF FEDERALLY FUNDED SCIENTIFIC RESEARCH
D OFFICE OF SCIENCE AND TECHNOLOGY POLICY 2014 MEMORANDUM: IMPROVING THE MANAGEMENT OF AND ACCESS TO SCIENTIFIC COLLECTIONS
E COMMITTEE MEETING AGENDAS: OPEN SESSION
Abbreviations and Acronyms
AARNET       Australia’s Academic and Research Network
AAU          Association of American Universities
ACS          American Chemical Society
AGU          American Geophysical Union
ALPSP        Association of Learned and Professional Society Publishers
ANDS         Australian National Data Service
APC          Article Processing Charges
APHIS        Animal and Plant Health Inspection Service
API          Application Programming Interface
APLU         Association of Public and Land-Grant Universities
APO          Apache Point Observatory
APOGEE       Apache Point Observatory Galactic Evolution Experiment
ARC          Astrophysical Research Consortium
ARS          Agricultural Research Service (U.S. Department of Agriculture)
ARXIV        Archive
AWSI         Amazon Web Services
BIA          Bureau of Indian Affairs
BIORXIV      Bio Archive
BLM          Bureau of Land Management
BOAI         Budapest Open Access Initiative
BOSS         Baryon Oscillation Spectroscopic Survey
BRDI         Board on Research Data and Information
CADRE        Center for the Advancement of Data and Research in Economics
CAPTCHA      Completely Automated Public Turing Test to Tell Computers and Humans Apart
CC           Creative Commons
CC BY        Creative Commons Attribution
CC BY-NC     Creative Commons Attribution-NonCommercial
CC BY-ND     Creative Commons Attribution-NoDerivs License
CC BY-SA     Creative Commons Attribution-ShareAlike
CC0          CC Zero
CDL          California Digital Library
CERN         Conseil Européen pour la Recherche Nucléaire (European Organization for Nuclear Research)
CHORUS       Clearinghouse for the Open Research of the United States
COAR         Confederation of Open Access Repositories
CODATA       Committee on Data for Science and Technology
COPE         Committee on Publication Ethics
COS          Center for Open Science
CRISPR/CAS9  Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-Associated Protein 9
CSIRO        Commonwealth Scientific and Industrial Research Organization
DASH         Digital Access to Scholarship at Harvard
DDI          Design, Development, and Implementation
DFIG         Data Fabric Interest Group
DMP          Data Management Platform
DO           Digital Object Architecture
DOAJ         Directory of Open Access Journals
DOE          U.S. Department of Energy
DOI          U.S. Department of Interior
DOIS         Digital Object Identifiers
DVN          Dataverse Network
EC           European Commission
EMBL-EBI     The European Molecular Biology Laboratory-The European Bioinformatics Institute
EOSC         European Open Science Cloud
EPA          U.S. Environmental Protection Agency
ESO          European Southern Observatory
ESSOAR       Earth and Space Science Open Archive
EU           European Union
EUA          European University Association
FAIR         Findable, Accessible, Interoperable, and Reusable
FASTR        Fair Access to Science and Technology Research
FDA          Food and Drug Administration
FDAAA        Food and Drug Administration Amendments Act
FOSTER       Facilitate Open Science Training for European Research
FRED         Federal Reserve Economic Data Site
FSIS         Food Safety and Inspection Service
FWS          U.S. Fish and Wildlife Service
G7           Group of Seven
GCMS         Geologic Collections Management System
GNU          General Public License
GO FAIR      Global Open Findable, Accessible, Interoperable, and Reusable
HEP          High Energy Physics
HIPAA        Health Insurance Portability and Accountability Act
HOA          Hybrid Open Access
HOAP         Harvard Open Access Project
ICPSR        Inter-university Consortium for Political and Social Research
ICSU         International Council for Science
IDW          International Data Week
IEDA         Interdisciplinary Earth Data Alliance
IEEE         Institute of Electrical and Electronics Engineers
IGSN         International Geo Sample Number
IMDB         Internet Movie Database
IODP         International Ocean Discovery Program
IoT          Internet of Things
IP           Internet Protocol
IRIS         Incorporated Research Institutions for Seismology
IT           Information Technology
IQSS         Institute for Quantitative Social Science
IUPUI        Indiana University-Purdue University Indianapolis
IWGSC        Interagency Working Group on Scientific Collections
JIF          Journal Impact Factor
JISC         Joint Information Systems Committee
JSTOR        Journal Storage
LIBER        Library Federations
LOD          Linked Open Data
LSST         Large Synoptic Survey Telescope
MAGIC        Magnetics Information Consortium
MANGA        Mapping Nearby Galaxies at APO
MDPI         Multidisciplinary Digital Publishing Institute
MEDOANET     Mediterranean Open Access Network
MIT          Massachusetts Institute of Technology
MPDL         Max Planck Digital Library
NASA         National Aeronautics and Space Administration
NASEM        National Academies of Sciences, Engineering, and Medicine
NBER         National Bureau of Economic Research
NDS          National Data Service
NIFA         National Institute of Food and Agriculture
NIH          National Institutes of Health
NIST         National Institute of Standards and Technology
NLM          National Library of Medicine
NOAA         National Oceanic and Atmospheric Administration
NPS          National Park Service
NRC          National Research Council
NRCS         Natural Resources Conservation Service
NSF          National Science Foundation
NSFNET       National Science Foundation Network
NSTC         National Science and Technology Council
NUTRIXIV     Nutritional Sciences Archive
OA           Open Access
OAD          Open Access Directory
OASPA        Open Access Scholarly Publishers Association
OECD         Organization for Economic Co-operation and Development
OPENAIRE     Open Access Infrastructure for Research in Europe
ORCID        Open Researcher and Contributor ID
ORFG         Open Research Funders Group
OSC          Open Science Collaboration
OSF          Open Science Framework
OSPP         Open Science Policy Platform
OSTP         Office of Science and Technology Policy
PCR          Polymerase Chain Reaction
PEERJ        Peer-Reviewed Journal
PII          Personally Identifiable Information
PLOS         Public Library of Science
PMC          PubMed Central
PNAS         Proceedings of the National Academy of Sciences
R&D          Research and Development
RCT          Randomized Controlled Trial
RDA          Research Data Alliance
RE3DATA      Registry of Research Data Repositories
REPEC        Research Papers in Economics
RNA          Ribonucleic Acid
ROARMAP      Registry of Open Access Repository Mandates and Policies
SCADA        Supervisory Control and Data Acquisition
SCICOLL      Scientific Collections International
SCOAP3       Sponsoring Consortium for Open Access Publishing in Particle Physics
SDSS         Sloan Digital Sky Survey
SESAR        System for Earth Sample Registration
SHARE        SHared Access Research Ecosystem
SLAC         Stanford Linear Accelerator Center
SPARC        Scholarly Publishing and Academic Resources Coalition
STEM         Science, Technology, Engineering and Mathematics
TCP/IP       Transmission Control Protocol/Internet Protocol
TOP          Transparency and Openness Promotion
UC           University of California
UK           United Kingdom
UNESCO       United Nations Educational, Scientific and Cultural Organization
USDA         U.S. Department of Agriculture
USFS         U.S. Forest Service
USFSC        U.S. Federal Scientific Collections
USGS         U.S. Geological Survey
UUID         Universally Unique Identifier
WAME         World Association of Medical Editors
WDS          World Data System