Modern science is driven by computations and simulations. Discoveries and insight often come from complex simulations of climate, space weather, and astronomical phenomena. At the same time, scientific work requires regular data processing, presentation, and analysis through broadly available proprietary and community software. Implicitly or explicitly, software is central to science. Scientific discovery, understanding, validation, and interpretation are all enhanced by access to the source code of the software used by scientists.
The National Aeronautics and Space Act of 1958 created NASA as an organization whose first goal is the “expansion of human knowledge of phenomena in the atmosphere and space.”1 The act also directs NASA to “provide for the widest practicable and appropriate dissemination of information concerning its activities and the results thereof.” This mandate underpins NASA’s mission, and NASA has a long history of encouraging openness in the research it conducts and sponsors.
Before the 1990s, digital data sharing was cumbersome, involving mailed magnetic tapes, compact disks, or hard drives. The scientist who physically held the data-storage medium controlled access, thereby limiting scientific advancement and reproducibility of results. With the advent of inexpensive digital storage and fast transfer of information over the Internet, it became easier to share data, and agency policies began adapting. In 1994, the National Oceanic and Atmospheric Administration (NOAA) and NASA Earth Science Division (ESD) committed to a full and open data policy for all civil Earth observation satellites.2
Following the movement toward open data is a movement to open software. The 2000 report of the President’s Information Technology Advisory Committee, Developing Open Source Software to Advance High End Computing, recommended that the “Federal Government should encourage the development of open source software as an alternate path for software development for high end computing.”3 It also recommended an analysis of existing open source licenses that could be distributed to various agencies, and that “the use of common licensing agreements should be encouraged.”
3 President’s Information Technology Advisory Committee, 2000, Developing Open Source Software to Advance High End Computing, Report to the President, https://www.nitrd.gov/Pubs/pitac/pres-oss-11sep00.pdf.
NASA later published the report Developing an Open Source Option for NASA Software, which states in the introduction, “Open Source is about enhanced software quality, more efficient software development, and increased collaboration.”4 It also acknowledges that the Open Source Initiative (OSI) “provides the most widely recognized guidelines as to what constitutes open source.” The report reviews the leading open source licenses (all OSI-approved) and associated issues such as export controls, the directions in “External Release of NASA Software” (NASA Procedures and Guidelines [NPG] 2210.1A), contractor rights, and copyright for software created by government employees. In particular, it highlights Section 126.96.36.199.2 of the NPG, which states that “software that is joint work between NASA employees and NASA contractors is protected under copyright.” It finally proposes that NASA utilize the Mozilla Public License (MPL), avoiding the “need to develop yet another license and submit it to the OSI for approval.” The committee will consider these issues in Chapter 2.
NASA decided to develop a new license, the NASA Open Source Agreement (NOSA), specifically designed for software generated by civil servants, which was approved by the OSI.5 NOSA has not been widely accepted by the open source community because of what appears to be a lack of understanding of the need for some of its unique provisions and differing interpretations of those same provisions, among other concerns. Chapter 2 will expand on issues associated with NOSA and review open source licensing in general.
To better understand the open source software (OSS) process, NASA held three open source summits, in 2011, 2012, and 2013, to discuss how to improve the development and release of OSS at NASA, how to advance the use of OSS throughout a wider government audience, and how to engage with and encourage open source communities, respectively.6 Also, the website open.nasa.gov was created to promote the agency’s OSS, data, and application programming interfaces.
Meanwhile, the expansion of science enabled by making data open to the public has influenced a drive for even greater openness across the federal government. In 2013, the Office of Science and Technology Policy (OSTP) issued the memorandum “Increasing Access to the Results of Federally Funded Scientific Research.”7 The memo commits each research and development agency to ensure that “the direct results of federally funded scientific research are made available to and useful for the public, industry, and the scientific community. Such results include peer-reviewed publications and digital data.” Furthermore, it directs each agency to “develop a plan to support increased public access to the results of research funded by the Federal Government.” In response to the OSTP memo, in 2015 NASA developed a Plan for Increasing Access to the Results of Scientific Research, which addresses data, but not software.8
In 2016, the Office of Management and Budget (OMB) issued the memorandum “Federal Source Code Policy: Achieving Efficiency, Transparency, and Innovation Through Reusable and Open Source Software.”9 This memo requires agencies to release at least 20 percent of new custom-developed software as open source, meaning that the source code is available and licensed for reuse, to increase efficiency across the federal government. The NASA Office of the Chief Information Officer (OCIO) responded that, beginning in 2017, NASA would comply with the requirement and that, when contracting for software development, “NASA will encourage vendors to use open source technology wherever possible.”10 These policies are specific to development of information technology and software solutions and do not include any directives regarding scientific research software. Different scientific disciplines, even within NASA, have a variety of experience, familiarity, and comfort sharing data, models, and software. Some fields have already established a culture where openness is expected (see discussion in Section 3.3.1).
4 P.J. Moran, 2003, “Developing an Open Source Option for NASA Software,” NAS Technical Report NAS-03-nnn, https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20030054432.pdf.
6 C.A. Mattmann, D.J. Crichton, A.F. Hart, S.C. Kelly, C.E. Goodale, P. Ramirez, J.S. Hughes, R.R. Downs, and F. Lindsay, 2012, Understanding open source software at NASA, IT Professional 14(2):29-35, doi:10.1109/MITP.2011.118.
It is in this context that NASA’s Science Mission Directorate (SMD) requested that the National Academies of Sciences, Engineering, and Medicine conduct a study on “Best Practices for a Future Open Code Policy for NASA Space Science.” The Committee on Best Practices for a Future Open Code Policy for NASA Space Science was formed with the following statement of task:
The National Academies of Sciences, Engineering, and Medicine will establish an ad hoc committee to investigate and recommend best practices for NASA as it considers whether to establish an open code and open models policy, complementary to its current open data policy. In carrying out the study the committee will:
- Review and describe examples of code/modeling policies developed by research teams and communities in the NASA-supported disciplines of Earth Science and Applications from Space, the Space Sciences, and other research communities, as appropriate;
- Develop a set of lessons learned from these established approaches—paying particular attention to issues such as, but not limited to, proprietary, export control, code/model maintenance, and documentation considerations;
- Define and describe options for policies on open software and open models for research supported by NASA Science Mission Directorate (SMD) and assess the pros and cons of these options from the perspective of the research community and the interests of NASA; and
- Recommend a set of best practices for NASA to consider should SMD decide to adopt an open code/open model policy for research supported by the agency. The committee may also choose to present alternate sets of best practices rather than just one recommended set.
The legal and executive directives above motivate this study, but they also occur within a larger context of an international movement toward greater transparency and openness of research as an accepted means to increase scientific rigor, expand knowledge, increase the pace of science, and benefit society. This trend is emphasized in a major National Academies report, Open Science by Design: Realizing a Vision for 21st Century Research, which was issued in July 2018 and stressed the benefits of open science, including rigor and reliability; faster and more inclusive dissemination of knowledge; broader participation in research; and effective use of resources.11 Open source practices are a key part of these, and indeed, fall under the first three major recommendations of Open Science by Design, which are listed below, and echo the findings and recommendations of the committee in the current report:
Research institutions should work to create a culture that actively supports Open Science by Design by better rewarding and supporting researchers engaged in open science practices. Research funders should provide explicit and consistent support for practices and approaches that facilitate this shift in culture and incentives.
Research institutions and professional societies should train students and other researchers to implement open science practices effectively and should support the development of educational programs that foster Open Science by Design.
Research funders and research institutions should develop the policies and procedures to identify the data, code, specimens, and other research products that should be preserved for long-term public availability, and they should provide the resources necessary for the long-term preservation and stewardship of those research products.
11 National Academies of Sciences, Engineering, and Medicine, Open Science by Design: Realizing a Vision for 21st Century Research, The National Academies Press, Washington, DC, 2018, https://doi.org/10.17226/25116, pp. 7-10.
While the National Academies report discussed above stresses the benefits of open science, as NASA considers whether to establish an SMD-wide policy on open source software (OSS), complementing its open data policy, some members of the community may call for more explanation of the benefits of OSS to science. The statement of task for this report was to recommend best practices for an OSS policy, not to evaluate the costs and benefits of an OSS policy on NASA science, but a short introduction is relevant. In 2012, Morin et al. proposed the following: “Requiring that source code be made available upon publication would . . . yield substantial benefits—including improved code quality, reduced errors, increased reproducibility, and greater efficiency through code reuse and sharing.”12 More broadly, Sonnenburg et al.13 list the following advantages: (1) reproducibility of scientific results and fair comparison of algorithms; (2) uncovering problems; (3) building on existing resources (rather than reimplementing them); (4) access to scientific tools without cease; (5) combination of advances; (6) faster adoption of methods in different disciplines and in industry; and (7) collaborative emergence of standards. The benefits can be ascribed either to the open source licensing model or to the open source development model. For example, improved code quality and fewer errors stem from the latter, while reproducibility and efficiency via reuse stem from the former. The better code quality of OSS, compared with proprietary or closed source software, has been shown within industry settings.14 In science, evidence of OSS benefits is still mostly anecdotal, but it is strong by way of counterexamples in which errors in software have resulted in retracted papers or erroneous trends in data.15 The efficiency gains from open source development models and code reuse are illustrated plainly by community library development.16
In this context, an OSS policy informed by this study is a logical next step for SMD as it moves toward more openness. Developing such a policy for SMD involves complex considerations, in terms of legal and practical constraints, intellectual property, and different software types and applications. Changes in policy are difficult, and the success of a policy can depend on how it is implemented. Accordingly, the committee assembled “lessons learned” from related policy implementations and the community responses. The committee has set forth a number of policy options and recommendations rather than the requested “best practices” for NASA to consider (Task 4 of the statement of task). The policy options and recommendations highlight important considerations in the implementation of an OSS policy at NASA SMD.
The purpose of any OSS policy that SMD may develop is to serve the goals of SMD and NASA. Based on NASA’s vision, mission, and mandates; guidance to the committee from NASA representatives; and the general interests of science and society, the committee identified seven goals.
- Enhance and enable innovation and discovery.
- Increase the visibility, access, and reuse of NASA-funded code.
- Facilitate scientific reproducibility.
- Encourage collaboration inside and outside of NASA.
- Maximize NASA’s benefit to society.
- Respect the security and privacy of citizens.
- Comply with broader government policies.
13 S. Sonnenburg, M.L. Braun, C.S. Ong, S. Bengio, L. Bottou, G. Holmes, Y. LeCun, et al., 2007, The need for open source software in machine learning, Journal of Machine Learning Research 8(Oct):2443-2466.
14 See, for example, Coverity, Inc., 2013, Coverity Scan: 2013 Open Source Report, http://softwareintegrity.coverity.com/rs/appsec/images/2013-Coverity-Scan-Report.pdf.
15 See, for example, G. Miller, 2006, A scientist’s nightmare: software problem leads to five retractions, Science 314(5807):1856-1857, https://doi.org/10.1126/science.314.5807.1856; J. Irons and J. Bivens, 2010, “Government Debt and Economic Growth Overreaching Claims of Debt ‘Threshold’ Suffer from Theoretical and Empirical Flaws”; H. Gee, 1998, Satellite climate record in error, Nature, doi:10.1038/news980820-1, https://www.nature.com/news/1998/980820/full/news980820-1.html; D.A.W. Soergel, 2014, Rampant software errors may undermine scientific results, F1000Research 3:303, doi:10.12688/f1000research.5930.2; D.C. Ince, L. Hatton, and J. Graham-Cumming, 2012, The case for open computer programs, Nature 482(7286):485; and B. Boehm, H.D. Rombach, and M.V. Zelkowitz (eds.), 2005, Foundations of Empirical Software Engineering: The Legacy of Victor R. Basili, Springer, New York.
16 A.M. Price-Whelan, B.M. Sipőcz, H.M. Günther, P.L. Lim, S.M. Crawford, S. Conseil, D.L. Shupe, M.W. Craig, N. Dencheva, A. Ginsburg, and J.T. VanderPlas, 2018, The Astropy Project: Building an Inclusive, Open-Science Project and Status of the v2.0 Software, arXiv preprint, https://arxiv.org/abs/1801.02634. See also Section 2.2 for discussion of community libraries.
These goals helped guide the committee’s information-gathering process. They provided context for discussing lessons learned from existing policies and community perspectives, policy options, and implementation strategies. They will also guide efforts that SMD develops to assess the effectiveness of the policies it implements.
Overall, the committee operated on the maxim of “as open as possible; as closed as necessary.”
For the purpose of this study, the terms open code and open source software are used synonymously, as defined in Section 2.1. Open source software has generally become a term of choice in the software development community.
To gauge the science community’s perspectives and understand the possible consequences of an OSS policy, the committee held three in-person meetings that included presentations from diverse stakeholders. The committee also solicited community white papers and received 44 thoughtful submissions that describe a variety of experiences and express both support for and concern about an OSS policy. The committee also received legal guidance from an unpaid consultant.
The report is organized as follows: Chapter 2 provides important definitions, a short explanation of relevant legal issues, and an overview of open source licensing models and development models. Chapter 3 reviews existing policies and the lessons learned from the implementation of those policies (Tasks 1 and 2). Chapter 4 summarizes community perspectives with additional lessons learned (Task 2). Chapter 5 presents policy options and recommendations for implementation for NASA SMD (Tasks 3 and 4). Chapter 6 summarizes the committee’s findings and discusses implications for SMD.
The charge to the committee was to evaluate options for a NASA OSS policy, not to argue for or against such a policy. Short discussions on the value of open science and OSS are included in Section 1.1 as background, but in general, the report focuses on the information requested in the statement of task.