OPEN SCIENCE BY DESIGN
Realizing a Vision for 21st Century Research
Committee on Toward an Open Science Enterprise
Board on Research Data and Information
Policy and Global Affairs
A Consensus Study Report of the National Academies of Sciences, Engineering, and Medicine
THE NATIONAL ACADEMIES PRESS
Washington, DC
www.nap.edu
THE NATIONAL ACADEMIES PRESS, 500 Fifth Street, NW, Washington, DC 20001
This activity was supported by the Laura and John Arnold Foundation. Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of any organization or agency that provided support for the project.
International Standard Book Number-13: 978-0-309-47624-9
International Standard Book Number-10: 0-309-47624-0
Library of Congress Control Number: 2018950760
Digital Object Identifier: https://doi.org/10.17226/25116
Additional copies of this publication are available for sale from the National Academies Press, 500 Fifth Street, NW, Keck 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.
Copyright 2018 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
Suggested citation: National Academies of Sciences, Engineering, and Medicine. 2018. Open Science by Design: Realizing a Vision for 21st Century Research. Washington, DC: The National Academies Press. doi: https://doi.org/10.17226/25116.
The National Academy of Sciences was established in 1863 by an Act of Congress, signed by President Lincoln, as a private, nongovernmental institution to advise the nation on issues related to science and technology. Members are elected by their peers for outstanding contributions to research. Dr. Marcia McNutt is president.
The National Academy of Engineering was established in 1964 under the charter of the National Academy of Sciences to bring the practices of engineering to advising the nation. Members are elected by their peers for extraordinary contributions to engineering. Dr. C. D. Mote, Jr., is president.
The National Academy of Medicine (formerly the Institute of Medicine) was established in 1970 under the charter of the National Academy of Sciences to advise the nation on medical and health issues. Members are elected by their peers for distinguished contributions to medicine and health. Dr. Victor J. Dzau is president.
The three Academies work together as the National Academies of Sciences, Engineering, and Medicine to provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions. The National Academies also encourage education and research, recognize outstanding contributions to knowledge, and increase public understanding in matters of science, engineering, and medicine.
Learn more about the National Academies of Sciences, Engineering, and Medicine at www.nationalacademies.org.
Consensus Study Reports published by the National Academies of Sciences, Engineering, and Medicine document the evidence-based consensus on the study’s statement of task by an authoring committee of experts. Reports typically include findings, conclusions, and recommendations based on information gathered by the committee and the committee’s deliberations. Each report has been subjected to a rigorous and independent peer-review process and it represents the position of the National Academies on the statement of task.
Proceedings published by the National Academies of Sciences, Engineering, and Medicine chronicle the presentations and discussions at a workshop, symposium, or other event convened by the National Academies. The statements and opinions contained in proceedings are those of the participants and are not endorsed by other participants, the planning committee, or the National Academies.
For information about other products and activities of the National Academies, please visit www.nationalacademies.org/about/whatwedo.
COMMITTEE ON TOWARD AN OPEN SCIENCE ENTERPRISE
Alexa T. McCray (NAM) (Chair), Professor of Medicine, Harvard Medical School
Francine Berman, Edward P. Hamilton Distinguished Professor of Computer Science, Rensselaer Polytechnic Institute
Michael Carroll, Professor of Law, American University Washington College of Law
Donna Ginther, Professor, Department of Economics; Director, Center for Science, Technology and Economic Policy, University of Kansas
Robert Miller, Chief Executive Officer, LYRASIS
Peter Schiffer, Vice Provost for Research and Professor in Applied Physics, Yale University
Edward Seidel, Vice President for Economic Development and Innovation, University of Illinois System; Founder Professor of Physics, Professor of Astronomy and Computer Science, University of Illinois at Urbana-Champaign
Alex Szalay, Bloomberg Distinguished Professor of Astronomy, The Johns Hopkins University
Lisa Tauxe (NAS), Distinguished Professor of Geophysics, Scripps Institution of Oceanography, University of California, San Diego
Heng Xu, Associate Professor of Information Sciences and Technology, College of Information Sciences and Technology, The Pennsylvania State University
Principal Project Staff
Tom Arrison, Program Director, Policy and Global Affairs Division (from November 2017)
Emi Kameyama, Associate Program Officer, Board on Research Data and Information
George Strawn, Director, Board on Research Data and Information
Ester Sztein, Deputy Director, Board on Research Data and Information
Nicole Lehmer, Senior Program Assistant, Board on Research Data and Information
Alan Anderson, Consultant
Christine Liu, Senior Program Officer (until October 2017)
BOARD ON RESEARCH DATA AND INFORMATION
Alexa McCray (NAM) (Chair), Professor of Medicine, Harvard Medical School
Amy Brand, Director, MIT Press
Kelvin Droegemeier, Vice President for Research, University of Oklahoma
Stuart Feldman, Chief Scientist, Schmidt Futures
Salman Habib, Senior Physicist and Computational Scientist, Argonne National Laboratory
James Hendler, Director, Institute for Data Exploration and Applications, Rensselaer Polytechnic Institute
Elliot Maxwell, Chief Executive Officer, e-Maxwell & Associates
Barend Mons, Chair in Biosemantics, Leiden University Medical Center
Sarah Nusser, Vice President for Research, Iowa State University
Michael Stebbins, President, Science Advisors, LLC
Bonnie Carroll,* Chairman and Chief Executive Officer, Information International Associates (CODATA Secretary General)
John Hildebrand (NAS),* Regents Professor of Neuroscience, University of Arizona (NAS Foreign Secretary)
Paul Uhlir,* Consultant, Data Policy and Management (CODATA Executive Committee Member)
Staff
George Strawn, Director
Tom Arrison, Program Director
Ester Sztein, Deputy Director
Emi Kameyama, Associate Program Officer
Nicole Lehmer, Senior Program Assistant
___________________
* Denotes ex-officio member.
Preface
Just as society has been transformed by the digital revolution, so, too, have many aspects of the scientific enterprise. Publicly available data in federally sponsored databases serve as starting points for many research investigations. Collaborations are no longer hampered by geographic distance, and, in some cases, the majority or even all of the work is conducted by sharing digital research files, corresponding by email, and meeting virtually, with time zone differences being the only deterrent to the frequency of the meetings. Data are largely collected, stored, manipulated, and shared in electronic form. Research papers are prepared using word-processing software and are often formatted and submitted in camera-ready form to the publisher. The majority of published articles are no longer bound in print journals and disseminated by conventional postal delivery, but rather are available through the publisher’s database, most often mediated by contracts with institutional libraries.
This transformation has had economic, policy, and practical implications, many of which are still in the process of being fully addressed and resolved. An increasing number of scientists have begun to question the closed world of scientific publishing and have suggested that the results of their research should be openly available for all, to benefit not only fellow scientists, but also the general citizenry. Indeed, the pursuit of “citizen science” is now recognized as a valid and useful activity. Faculty at many universities have adopted university-wide “open access” policies that ensure that, at a minimum, their research papers are available through their institution’s repository.
New publishing venues have arisen, including open access journals, some of very high quality and others not. Individual researchers, while interested in having their work broadly read and cited, are faced with competing pressures, including publishing in journals with high “impact factors,” such that they are in the best possible position for promotion and tenure.
Research funders have seen the value of openly sharing the results of the research that they have supported, not just in the form of publications, but also in the form of the data that have been produced in the course of the investigation. They have begun to require that applicants prepare data management plans as part of their grant proposals.
A number of legal and policy developments have facilitated broader access to scientific research. Recognizing the potential of the Internet to broadly and equitably disseminate scientific knowledge, a collaborative effort has created a legal framework that is consistent with U.S. copyright law, and that provides guidance to researchers who would like to have greater control over how their research results are used and disseminated. Several federal policies require that publicly funded research results, in the form of data and publications, be deposited in public access repositories. Legislation is now also pending in Congress that would strengthen these policies.
To evaluate more fully the benefits and challenges of broadening access to the results of scientific research, described as “open science,” the National Academies of Sciences, Engineering, and Medicine appointed an expert committee in March 2017. Brief biographies of the individual committee members are provided in Appendix A. The committee was charged with focusing on how to move toward open science as the default for scientific research results, and to indicate both the benefits of moving toward open science and the barriers to doing so. This report presents the findings and recommendations of the committee, with the majority of the focus on solutions that move the research enterprise toward open science.
This Consensus Study Report was reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise. The purpose of this independent review is to provide candid and critical comments that will assist the National Academies of Sciences, Engineering, and Medicine in making each published report as sound as possible and to ensure that it meets the institutional standards for quality, objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their review of this report: Prudence Adler, Association of Research Libraries; David Allison, Indiana University, Bloomington; Geoffrey Boulton, University of Edinburgh; Anita de Waard, Elsevier; Michael Forster, Institute of Electrical and Electronics Engineers; Laura Greene, Florida State University; Heather Joseph, Scholarly Publishing and Academic Resources Coalition; Véronique Kiermer, Public Library of Science; Michael Lesk, Rutgers University; William Mobley, University of California, San Diego; Mark Musen, Stanford University; Sarah Nusser, Iowa State University; and George Schatz, Journal of Physical Chemistry.
Although the reviewers listed above provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations of this report, nor did they see the final draft before its release. The review of this report was overseen by Carl Lineberger, University of Colorado, Boulder, and Julia Phillips, Sandia National Laboratories (Retired). They were responsible for making certain that an independent examination of this report was carried out in accordance with the standards of the National Academies and that all review comments were carefully considered. Responsibility for the final content rests entirely with the authoring committee and the National Academies.
The report would not have been possible without the sponsor of this study, the Laura and John Arnold Foundation, whom we thank for their support. The committee gratefully acknowledges all of the speakers for their informative presentations at our meeting and public symposium. They are listed in Appendix E at the conclusion of the report. The information provided during the meeting and symposium is used throughout this report and informed its findings and conclusions.
The committee is also grateful for the assistance of the National Academies staff in preparing this report. Staff members who contributed to this effort are Tom Arrison, program director, Policy and Global Affairs; Emi Kameyama, associate program officer, Board on Research Data and Information; George Strawn, director, Board on Research Data and Information; Ester Sztein, deputy director, Board on Research Data and Information; Nicole Lehmer, senior program assistant, Board on Research Data and Information; Alan Anderson, consultant; Christine Liu, senior program officer (through October 2017); Adriana Courembis, financial officer; Marilyn Baker, director for reports and communication; and John Boright, interim executive director, Policy and Global Affairs.
Finally, I especially thank the members of the committee for their tireless efforts throughout the development of this report.
Alexa T. McCray, Chair
Committee on Toward an Open Science Enterprise
Contents
2 BROADENING ACCESS TO THE RESULTS OF SCIENTIFIC RESEARCH
Origins and Significance of Open Science
Current Approaches to Open Science
4 A VISION FOR OPEN SCIENCE BY DESIGN
Principles of Open Science by Design
Practicing Open Science by Design
Enabling Technologies for Open Science by Design
Strengthening Training for Open Science by Design
Abbreviations and Acronyms
AARNET | Australia’s Academic and Research Network |
AAU | Association of American Universities |
ACS | American Chemical Society |
AGU | American Geophysical Union |
ALPSP | Association of Learned and Professional Society Publishers |
ANDS | Australian National Data Service |
APC | Article Processing Charges |
APHIS | Animal and Plant Health Inspection Service |
API | Application Programming Interface |
APLU | Association of Public and Land-Grant Universities |
APO | Apache Point Observatory |
APOGEE | Apache Point Observatory Galactic Evolution Experiment |
ARC | Astrophysical Research Consortium |
ARS | Agricultural Research Service (U.S. Department of Agriculture) |
ARXIV | Archive |
AWS | Amazon Web Services |
BIA | Bureau of Indian Affairs |
BIORXIV | Bio Archive |
BLM | Bureau of Land Management |
BOAI | Budapest Open Access Initiative |
BOSS | Baryon Oscillation Spectroscopic Survey |
BRDI | Board on Research Data and Information |
CADRE | Center for the Advancement of Data and Research in Economics |
CAPTCHA | Completely Automated Public Turing Test to Tell Computers and Humans Apart |
CC | Creative Commons |
CC BY | Creative Commons Attribution |
CC BY-NC | Creative Commons Attribution-NonCommercial |
CC BY-ND | Creative Commons Attribution-NoDerivs License |
CC BY-SA | Creative Commons Attribution-ShareAlike |
CC0 | CC Zero |
CDL | California Digital Library |
CERN | Conseil Européen Pour La Recherche Nucléaire (European Organization for Nuclear Research) |
CHORUS | Clearinghouse for the Open Research of the United States |
COAR | Confederation of Open Access Repositories |
CODATA | Committee on Data for Science and Technology |
COPE | Committee on Publication Ethics |
COS | Center for Open Science |
CRISPR/CAS9 | Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-Associated Protein 9 |
CSIRO | Commonwealth Scientific and Industrial Research Organization |
DASH | Digital Access to Scholarship at Harvard |
DDI | Design, Development, and Implementation |
DFIG | Data Fabric Interest Group |
DMP | Data Management Plan |
DO | Digital Object Architecture |
DOAJ | Directory of Open Access Journals |
DOE | U.S. Department of Energy |
DOI | U.S. Department of the Interior |
DOIs | Digital Object Identifiers |
DVN | Dataverse Network |
EC | European Commission |
EMBL-EBI | The European Molecular Biology Laboratory-The European Bioinformatics Institute |
EOSC | European Open Science Cloud |
EPA | U.S. Environmental Protection Agency |
ESO | European Southern Observatory |
ESSOAR | Earth and Space Science Open Archive |
EU | European Union |
EUA | European University Association |
FAIR | Findable, Accessible, Interoperable, and Reusable |
FASTR | Fair Access to Science and Technology Research |
FDA | Food and Drug Administration |
FDAAA | Food and Drug Administration Amendments Act |
FOSTER | Facilitate Open Science Training for European Research |
FRED | Federal Reserve Economic Data Site |
FSIS | Food Safety and Inspection Service |
FWS | U.S. Fish and Wildlife Service |
G7 | Group of Seven |
GCMS | Geologic Collections Management System |
GNU | GNU's Not Unix (as in the GNU General Public License) |
GO FAIR | Global Open Findable, Accessible, Interoperable, and Reusable |
HEP | High Energy Physics |
HIPAA | Health Insurance Portability and Accountability Act |
HOA | Hybrid Open Access |
HOAP | Harvard Open Access Project |
ICPSR | Inter-university Consortium for Political and Social Research |
ICSU | International Council for Science |
IDW | International Data Week |
IEDA | Interdisciplinary Earth Data Alliance |
IEEE | Institute of Electrical and Electronics Engineers |
IGSN | International Geo Sample Number |
IMDB | Internet Movie Database |
IODP | International Ocean Discovery Program |
IoT | Internet of Things |
IP | Internet Protocol |
IRIS | Incorporated Research Institutions for Seismology |
IT | Information Technology |
IQSS | Institute for Quantitative Social Science |
IUPUI | Indiana University-Purdue University Indianapolis |
IWGSC | Interagency Working Group on Scientific Collections |
JIF | Journal Impact Factor |
JISC | Joint Information Systems Committee |
JSTOR | Journal Storage |
LIBER | Ligue des Bibliothèques Européennes de Recherche (Association of European Research Libraries) |
LOD | Linked Open Data |
LSST | Large Synoptic Survey Telescope |
MAGIC | Magnetics Information Consortium |
MANGA | Mapping Nearby Galaxies at APO |
MDPI | Multidisciplinary Digital Publishing Institute |
MEDOANET | Mediterranean Open Access Network |
MIT | Massachusetts Institute of Technology |
MPDL | Max Planck Digital Library |
NASA | National Aeronautics and Space Administration |
NASEM | National Academies of Sciences, Engineering, and Medicine |
NBER | National Bureau of Economic Research |
NDS | National Data Service |
NIFA | National Institute of Food and Agriculture |
NIH | National Institutes of Health |
NIST | National Institute of Standards and Technology |
NLM | National Library of Medicine |
NOAA | National Oceanic and Atmospheric Administration |
NPS | National Park Service |
NRC | National Research Council |
NRCS | Natural Resources Conservation Service |
NSF | National Science Foundation |
NSFNET | National Science Foundation Network |
NSTC | National Science and Technology Council |
NUTRIXIV | Nutritional Sciences Archive |
OA | Open Access |
OAD | Open Access Directory |
OASPA | Open Access Scholarly Publishers Association |
OECD | Organization for Economic Co-operation and Development |
OPENAIRE | Open Access Infrastructure for Research in Europe |
ORCID | Open Researcher and Contributor ID |
ORFG | Open Research Funders Group |
OSC | Open Science Collaboration |
OSF | Open Science Framework |
OSPP | Open Science Policy Platform |
OSTP | Office of Science and Technology Policy |
PCR | Polymerase Chain Reaction |
PEERJ | Peer-Reviewed Journal |
PII | Personally Identifiable Information |
PLOS | Public Library of Science |
PMC | PubMed Central |
PNAS | Proceedings of the National Academy of Sciences |
R&D | Research and Development |
RCT | Randomized Controlled Trial |
RDA | Research Data Alliance |
RE3DATA | Registry of Research Data Repositories |
REPEC | Research Papers in Economics |
RNA | Ribonucleic Acid |
ROARMAP | Registry of Open Access Repository Mandates and Policies |
SCADA | Supervisory Control and Data Acquisition |
SCICOLL | Scientific Collections International |
SCOAP3 | Sponsoring Consortium for Open Access Publishing in Particle Physics |
SDSS | Sloan Digital Sky Survey |
SESAR | System for Earth Sample Registration |
SHARE | SHared Access Research Ecosystem |
SLAC | Stanford Linear Accelerator Center |
SPARC | Scholarly Publishing and Academic Resources Coalition |
STEM | Science, Technology, Engineering and Mathematics |
TCP/IP | Transmission Control Protocol/Internet Protocol |
TOP | Transparency and Openness Promotion |
UC | University of California |
UK | United Kingdom |
UNESCO | United Nations Educational, Scientific and Cultural Organization |
USDA | U.S. Department of Agriculture |
USFS | U.S. Forest Service |
USFSC | U.S. Federal Scientific Collections |
USGS | U.S. Geological Survey |
UUID | Universally Unique Identifier |
WAME | World Association of Medical Editors |
WDS | World Data System |