6 Accelerating Progress to Open Science by Design

The benefits of open science are accruing to researchers themselves, research sponsors, research institutions, disciplines, and scholarly communicators. Yet, despite significant progress toward creating an open science ecosystem, today's science is not completely open. Most scientific articles are only available on a subscription basis. Sharing data, code, and other research products is becoming more common, but is still not routinely done across all disciplines. Barriers to more rapid progress include an academic culture and researcher incentives that can work against open science, insufficient infrastructure and training, issues related to data privacy and national security, and the economic structure of the scholarly communications market.

Open science also needs to overcome less defined sources of skepticism, which it can only do by proving its value to the research enterprise over time. Many important transformations and innovations in the history of science, and in history more broadly, have been opposed at first because of difficulty in quantifying or even imagining the benefits. For example, much of the biomedical research community was strongly opposed to the Human Genome Project when it was first proposed, believing that it diverted resources from more valuable investigator-driven work (Palca, 1992). The project and its impact look much different in hindsight. Today's advances in biomedical research, and many other fields such as archaeology, would not be imaginable without genomic mapping and analysis. Also, researchers who are used to a framework where they are accountable to colleagues, to their disciplines, and to their institutions may be uneasy with open science's implication that they are or should be accountable to the broader public.

The open science movement stands at an important inflection point. A new generation of information technology tools and services holds the potential of further revolutionizing scientific practice. For example, the ability to automate the process of searching and analyzing linked articles and data can reveal patterns that would escape human perception, making the process of generating and testing hypotheses faster and more efficient. These tools and services will have maximum impact when used within an open science ecosystem that spans institutional, national, and disciplinary boundaries. At the same time, a number of organizations around the world are adopting new policies and launching new initiatives aimed at fostering open science.
Open Science by Design: Realizing a Vision for 21st Century Research

The vision of open science by design presented in this report seeks to enable the large population of stakeholders to move more rapidly toward open science as the default condition for the research they support. These stakeholders include the researchers themselves, universities, private and nonprofit organizations, publishers and journal editors, scientific societies, the philanthropic community, and federal agencies. Despite the barriers that must still be overcome to implement open science, the momentum of the movement toward open science is generally apparent, and strategies for accelerating access have been outlined by many members of the scientific community. To help accelerate this progress further, the committee has reviewed several recent recommendations, including those of a report by the Association of American Universities (AAU) and Association of Public and Land-grant Universities (APLU) and the European Open Science Cloud (EOSC) Declaration, before developing an action statement for specific stakeholders.

RECENT DEVELOPMENTS

AAU-APLU Public Access Working Group Report

A joint working group on public access convened by the AAU and APLU released a report in November 2017 that provides recommendations and summarizes actions for federal agencies and universities to advance public access to data in a sustainable manner. The report recognizes that a significant culture shift at universities and among their faculty is required, in addition to carefully crafted new federal policies and investment in data infrastructure that support open access (AAU-APLU, 2017). The report also suggests, "by committing to a set of shared principles and minimal levels of standardization across institutions and agencies, we can help minimize costs, enhance interoperability between institutions and disciplines, and maximize the control institutions can exert over how they ensure access to publicly funded scholarship" (AAU-APLU, 2017, p. 1).

EOSC Declaration

Internationally, the European Commission released the EOSC Declaration in October 2017 calling on all scientific stakeholders to endorse and commit to the principles of the declaration by 2020. The declaration, which emerged as a result of the EOSC Summit held in June 2017, recognizes the challenges of data-driven research in pursuing excellent science; grants the vision of European Open Science as widely inclusive of all disciplines and Member States in the long term; and confirms the implementation of the EOSC as a process based on constant learning and mutual alignment (EC, 2017a). Regarding data culture, it notes that "only a considerable cultural change will enable long-term reuse for science and for innovation of data created by research activities: no disciplines, institutions or countries must be left behind" (EC, 2017a, p. 1).
FINDINGS, RECOMMENDATIONS AND IMPLEMENTATION ACTIONS

The Committee on Toward an Open Science Enterprise has developed the following set of findings and recommendations based on its review and synthesis of the information gathered throughout the course of the study. Each recommendation is the focus of a section that includes a discussion of relevant issues drawing on other parts of the report and a set of findings. Each of the five recommendations is followed by implementation actions specifying agencies, universities, or other organizations to guide stakeholder efforts to foster open science by design.

Building a Supportive Culture

The motivations for and barriers to open science discussed in Chapter 2 present something of a paradox, which is clearly expressed by Nosek et al. (2015):

Transparency, openness, and reproducibility are readily recognized as vital features of science. When asked, most scientists embrace these features as disciplinary norms and values. Therefore, one might expect that these valued features would be routine in daily practice. Yet, a growing body of evidence suggests that this is not the case.

The actual and anticipated benefits of open science include more reliable knowledge, more rapid and creative generation of results, and broader and more inclusive participation in the research process. Significant barriers to wider and quicker adoption of open practices include the incentives and underlying cultural assumptions that operate in many fields.

The specific ways in which cultural barriers to open science operate vary significantly by field or discipline. Overuse and misuse of bibliographic metrics such as the Journal Impact Factor in the evaluation of research and researchers is one important "bug" in the operation of the research enterprise that has a detrimental effect across disciplines, as explained in Chapter 2.
The perception and/or reality that researchers need to publish in certain venues in order to secure funding and career advancement may lock researchers into traditional, closed mechanisms for reporting results and sharing research products. These pressures are particularly strong for early career researchers.

Initiatives such as the San Francisco Declaration on Research Assessment seek to achieve broad buy-in on the part of stakeholders to move toward evaluation systems that use other methodologies. Concrete actions, such as the National Institutes of Health (2017a) decision to encourage investigators to use and cite interim research products such as preprints in seeking funding, can have a beneficial effect.

Continued effort by stakeholders, working internationally and across disciplinary boundaries, is needed to change evaluation practices and introduce other incentives so that the cultural environment of research better supports and rewards open practices.

Findings

• The culture of academia does not adequately reward and support researchers engaged in open science practices.
• University tenure and promotion committees give credit for journal publications, but rarely give explicit credit to investigators who make their publications and data openly available for use by the broader community and thus do not incentivize such practices.
• There are increasing opportunities for authors to make their research products openly available. Many high-quality open access journals exist. An increasing number of high-quality open access publishers are supported by philanthropy and host institutions and offer fee waivers to authors in case of economic hardship (Shieber, 2009; Lawson, 2015). There are even peer-reviewed open access publishers that charge a nominal article processing charge or none at all. The Directory of Open Access Journals can be searched to find appropriate journals (DOAJ, 2018). Many journal publishers do not prohibit prospective authors from depositing their initial manuscripts in preprint servers. Most journal publishers do not prohibit authors from posting their accepted articles on their personal websites or depositing them in their university's open access repository. Most federal agencies require deposit of federally funded research results in public repositories.
• Journal articles are currently the primary method for summarizing and sharing scientific results, and the journal's impact factor plays a large role in the assessment of academic achievement. In the digital age, while the journal framework may well continue for branding and content integration purposes, compiling articles in journals for distribution is no longer a requirement for broad distribution.
Recommendation One

Research institutions should work to create a culture that actively supports Open Science by Design by better rewarding and supporting researchers engaged in open science practices. Research funders should provide explicit and consistent support for practices and approaches that facilitate this shift in culture and incentives.

Implementation Actions

• Universities and other research institutions should explicitly reward the effort needed to make science open by design.
• Universities and other research institutions should partner with federal agencies in developing innovative approaches to assessing the impact of research in ways that include the impact of open science outputs. This should include, but is not limited to, the development of metrics for assessing the impact of interim research products such as preprints, with a view toward comparing those with existing methods for measuring impact.
• Universities and other research institutions should move toward evaluating published data and other research products in addition to published articles as part of the promotion and tenure process. Archived data should be valued, just as the publications that result from them are valued.
• Researchers should make full use of the many opportunities that are available for making their research products openly available, and they should include that information in their curriculum vitae so that they can be appropriately credited and rewarded.
• In fields where this is not already common practice, research funders should encourage and reward the use of data and other research products that are available in publicly accessible databases.
• Universities and other research institutions should encourage and reward studies that focus on the replication and reproducibility of published research. Such studies should be published and made openly available.

Training for Open Science by Design

The importance of training for open science by design is discussed in several places in the report, particularly Chapter 4. Initiatives such as the European Union's FOSTER project and the Berkeley Initiative for Transparency in the Social Sciences (BITSS) have emphasized training in open science and reproducibility. The emergence of data science as a recognized interdisciplinary field has highlighted the need for new educational content and approaches related to data (NASEM, 2018a).
Several federal agencies require that students or trainees supported by grants receive training in the responsible conduct of research, or RCR (NASEM, 2017b). Training and education that covers issues such as open science and reproducibility would complement the existing focus of RCR education and orient these programs toward supporting both research integrity and quality.

Findings

• Few academic institutions provide formal training and education in the principles and practices of open science.
• The university library community has an important role to play in the promulgation and support of open science principles and practices.
• Federal training programs, while requiring training in the responsible conduct of research, do not explicitly require training in the many aspects of open science principles and practices.

Recommendation Two

Research institutions and professional societies should train students and other researchers to implement open science practices effectively and should support the development of educational programs that foster Open Science by Design.

Implementation Actions

• Universities should provide training in best practices for open science and data stewardship as part of the regular curriculum in graduate and postgraduate education and should expect these practices in all onboarding/orientation processes of universities, including new student orientation, new faculty orientation, library orientations, and lab training as a default. Course curricula should be developed and implemented to complement domain-specific courses that support open science by design.
• Research funders should support the development of training programs in the principles and practices of open science by design. Federal agencies should require this training as part of all federally funded graduate training grants (e.g., NSF research traineeships and NIH training grants) to foster open science by design.
• Library and information science schools, professional societies, and other interested organizations should develop course curricula and offer courses in the principles and practices of open science.
• Research funders and professional societies should create programs or contests that seek the creative and innovative integration and (re)use of open data for new and impactful research.
• The private sector and other interested parties should create innovative educational tools for open science principles and practices.
Ensuring Long-Term Preservation and Stewardship

The issues and challenges related to preservation and stewardship of research products, particularly data, code, and other nonarticle products, are considered in several places in the report. On the one hand, some of the technical and cost barriers to long-term data stewardship are falling, as tools for automated metadata tagging and classification become more widely used and cloud storage becomes cheaper over time. At the same time, the outputs of research continue to grow in volume and complexity, meaning that significant additional resources will still be required. In addition, ensuring preservation and long-term stewardship, particularly beyond the time period specified by the grant, requires standards and institutional capabilities that need to be developed by stakeholders and updated over time.

Findings

• Ensuring long-term preservation and stewardship of data and other research products requires a commensurate long-term commitment of resources.
• Public access to data and scientific collections created with federal support is required by federal agencies, but the infrastructure and funding to store, curate, and preserve data, code, samples, and other research products are not necessarily available.
• Although some of the technical and cost barriers to large-scale data storage are falling, the outputs of research continue to grow in volume and complexity, meaning that significant additional resources will still be required. Significant cultural and institutional barriers also remain.
• The library community, including archivists, curators, and other information scientists, plays an important role in effecting long-term preservation and stewardship.
• Scientific disciplines vary in the extent to which data and other research products are shared and archived.
• Not all data and other research products should be preserved for the long term, and most research communities do not have well-defined criteria for determining what data and physical collections should be preserved and for what length of time. The rise of interdisciplinary research implies that data preservation criteria should consider possible use outside of the discipline in which the research was originally conducted.
• Most federal agencies require a data management plan as part of grant applications, although there is insufficient guidance for compliance expectations and institutional responsibilities.
• Developing and sustaining the infrastructure required for long-term stewardship of research products will present a continuing challenge. The work of developing necessary standards and policies on the part of stakeholders will enable effective planning of new infrastructure and associated financing.
• Approaches should be flexible enough to adapt and change over time. The size and complexity of data in many fields are changing rapidly, so that the solutions that are effective today might not be effective in a few years. At the same time, we have seen new tools and platforms continue to emerge that allow researchers to address challenges that were previously intractable.
Recommendation Three

Research funders and research institutions should develop the policies and procedures to identify the data, code, specimens, and other research products that should be preserved for long-term public availability, and they should provide the resources necessary for the long-term preservation and stewardship of those research products.

Implementation Actions

• Research institutions, professional societies, and research funders should work together to develop selection guidelines and long-term stewardship best practices for the most valuable community datasets and other research products.
• Federal agencies should, consistent with the 2013 and 2014 Office of Science and Technology Policy (OSTP, 2013, 2014) memoranda for expanding public access to the results of federally funded research, continue to develop and standardize requirements for research products planning, management, reporting, and stewardship.
• Private research funders who have not already done so should adopt approaches compatible with those developed for publicly funded research products planning, management, reporting, and stewardship.
• Researchers should describe the plan for dissemination and stewardship of their research products with some specificity, consistent with the standardized sponsor requirements described above, including where their research products will be made publicly available and for what period of time.
• Research funders and research institutions should work together to resource and provide the infrastructure needed for long-term preservation, stewardship, and community control of research products. This infrastructure could be supported through direct costs or through an earmarked percentage of each funded grant.
Facilitating Data Discovery, Reuse, and Reproducibility

As progress toward open science by design continues, it is important that the community adhere to the ultimate goal of achieving the availability of research products under FAIR (findable, accessible, interoperable, reusable) principles. Open science under FAIR principles has the potential to deliver benefits to those researchers and disciplines that are participating, which will help make the case for supporting openness. Utilizing advanced machine learning tools in analyzing datasets or literature, for example, will facilitate new insights and discoveries. Ensuring FAIR access should be a key consideration in deciding how to build repositories and other new resources.
As is the case with ensuring long-term stewardship, new standards should be developed by funders in collaboration with research institutions and researchers. Fields and disciplines that do not already have well-developed standards and practices for making research products available under FAIR principles will need time and help to create these. Where meeting new standards imposes costs, funders should make the necessary resources available. Open science will be realized more quickly and effectively by avoiding the imposition of unfunded mandates. Specific actions enabling a transition need to be developed in a transparent manner, and avoid disrupting researchers and their work to the extent possible.

Findings

• It is difficult to determine how much data (open or otherwise) are generated through federally sponsored research projects and where they can be found. It is difficult to plan agency or budgetary data strategies based on this missing information.
• For certain types of data in several disciplines (e.g., computational biology, genomics, proteomics), papers cannot be submitted to major journals unless the relevant data have already been deposited in an open domain repository. This has facilitated the discovery and reuse of data as well as the reproducibility of research. At the same time this has only happened in a small number of fields.
• It is difficult to discover datasets and code through search, making the "findable" part of the FAIR principles challenging.
• There is considerable variation among different disciplines for what constitutes ethical practices in the publication and usage of open data.
• Public access to research data is not sufficient to ensure usability and enable reuse. Uncurated data are often difficult to use. Data curation, management, and stewardship allow for optimal discovery, reuse, and validation of the results of scientific research.
• The value of open data depends heavily on the proper usage of such data, which in turn relies on a proper understanding of how the data were generated and organized. Disciplinary differences are considerable, and some very large and complex datasets require considerable knowledge and expertise to use effectively.
• For most researchers, the amount of the relevant published literature is beyond the human capacity to gather, read, and analyze without the assistance of automated discovery and analytical tools. Such tools are in development, but that development is impeded by the lack of ready access to the entire corpus of published scientific research by tool developers.
• Open access publications are legally available for all, although not all open access publishers make their content readily available for bulk transfer to tool developers or users of text and data mining tools.
• Subscription publishers have varying policies concerning the availability and use of their publications for text and data mining, with the largest publishers making this content available only under the terms of a negotiated license agreement.
• Open access to the data and metadata, along with the code used to generate and/or interpret those data, supports reproducibility, replicability, and the reliability of reported results.

Recommendation Four

Funders that support the development of research archives should work to ensure that these are designed and implemented according to the FAIR data principles. Researchers should seek to ensure that their research products are made available according to the FAIR principles and state with specificity any exceptions based on legal and ethical considerations.

Implementation Actions

• Researchers should preferentially use open repositories that have been designed for interoperability and ease of discovery.
• Research funders should work to ensure that research products are available in repositories that allow for bulk transfer of digital objects to developers or users of automated discovery and analysis tools.
• Researchers and research funders should require that research products designated for long-term preservation and stewardship are assigned persistent unique digital identifiers.
• Professional societies and research funders should support efforts to network and federate existing repositories for improved discoverability.
• Research funders should continue to support the development of methods and tools that improve the interoperability of heterogeneous data. Metadata schemes, commonly accepted workflows for the processing and analysis of data, and other standards should be developed and used for improved data discovery.
• Research funders should commission an independent assessment of the state of university and federal data archives. The assessment should address how the FAIR principles have or have not been adhered to and make recommendations for improving accessibility to distributed or federated archives.

Developing New Approaches to Fostering Open Science by Design

As the report discusses in Chapters 3 and 5, there is a great deal of activity on the part of public and private research funders, research institutions, commercial and nonprofit publishers, community-organized groups, and others aimed at preparing for and shaping a future research enterprise characterized by open science. Significant progress has been made, but a great deal of work needs to be done before open science by design is a reality. The committee focused on the choices facing U.S. organizations and institutions, realizing that the transition to open science by design is inherently a global process.

Chapter 5 describes a number of issues, a few possible scenarios, and options for action. The recent AAU-APLU report emphasizes the need for federal and other research sponsors to clarify requirements. In addition, revisiting federal policies supporting open science would allow for approaches to be modified and updated. Specific actions enabling a transition need to be developed in a transparent manner, and avoid disrupting researchers and their work to the extent possible. The research enterprise is at an important point in the transition to open science, where research sponsors, both public and private, have an opportunity to shape the future through their investments.

Findings

• Significant progress in open science practices has been made in recent years, but the majority of research products are not open, and very little research output meets the FAIR guidelines.
• Many, though not all, research funder policies are moving toward open science principles and practices.
• Infrastructure for open science is being designed and deployed, although with variation across fields of study.
• Disciplinary preprint servers, such as arXiv, RePEc and BioRxiv, have successfully provided an open platform to post prepublication versions of manuscripts at no charge. These platforms have had an important positive effect on these disciplines.
• Open publications and open data provide an opportunity for the private sector and others to develop useful products for researchers and other communities.
• The current subscription-based business model for many publishers conflicts with the goal of immediate open access to publications and data.
• Article processing charges are a possible replacement for subscription fees as a business model, but they also have limitations, since the payment of the charges will still be a burden on some part of the ecosystem and will fall unevenly on different stakeholders.
• Certain approaches to implementing open publication have the potential to affect the research ecosystem in significant ways, with differential impacts on different stakeholders. In planning new policies and transitions, it will be necessary to anticipate differential impacts to the extent possible, consider ways of avoiding these, and build in evaluative and corrective mechanisms to address unanticipated consequences.
Recommendation Five

The research community should work together to realize Open Science by Design to advance science and help science better serve the needs of society.

Implementation Actions

• The federal government should revisit and update its open science policy, which is expressed in the 2013 and 2014 OSTP memoranda.
• Funders, institutions, and researchers should align policies and incentives to realize open publication, including rights-retention provisions.
• Research funders should support the establishment of a consortium of research community stakeholders to develop additional concrete methods for implementing open science by design.
• Professional societies, individually and collectively, should work to transition from current business models to new ones that foster open science by design.
• Journal editors should work with publishers to transition from current business models to new ones that foster open science by design.
• Research funders should explore innovative means to support the transition from subscription-based systems to new publication strategies that enable open science by design.
• Librarians should work together with other members of the research community to promote and implement open science by design.
• The research community should develop tools and other applications that depend on the long-term availability of open research products, thereby providing new sources of revenue for the private sector, enhancing the value of research products, and leading to an acceleration of scientific progress.