Summary
The Committee that assembled this report was empanelled by the National Research Council (NRC) at the request of the National Oceanic and Atmospheric Administration-National Environmental Satellite, Data, and Information Service (NOAA-NESDIS) to provide advice on archiving and providing access to the broad range of environmental and geospatial data collected by NOAA and its partners. With limited resources and enormous growth in data volumes, NOAA seeks input on how to identify the observations, data, and derived products that should be preserved in perpetuity and made readily accessible versus those that require limited access and storage lifetimes. Pursuant to its statement of task (Appendix B), and based on its collective experience, the materials it has reviewed to date, and its initial deliberations, the Committee proposes the following preliminary principles and guidelines for data archiving activities at NOAA. These preliminary principles and guidelines are not ranked and should be regarded as a framework for further discussion; they will be further developed and expanded by this Committee in a final report that also addresses data access at NOAA.
-
The environmental and geospatial data collected by NOAA and its partners, including model output, are an invaluable resource that should be archived and made accessible in a form that allows researchers and educators to conduct analyses and generate products necessary to accurately describe the Earth System.
-
The decision to archive or continue to archive data or model output should be driven by its current or future value to society. The decision will need to take into account the cost to archive versus the cost to regenerate, as well as the costs of providing access to the data.
-
Funding for Earth System measurements should include sufficient resources to archive and provide ready and easy access to these data for an extended period of time. In particular, at the outset of any activity that will generate data or model output, end-to-end data management needs to be planned and budgeted.
-
All data that are well documented, are of known quality, and represent systematic collections or characterizations of the state of the environment should be archived in their most primitive useful form.
-
Decisions not to archive data permanently should only occur when the original and predicted purpose of the data has been satisfied, or when the cost of storing the data exceeds the cost of regeneration,1 and should be made in collaboration with the appropriate user communities.
-
Metadata that completely document and describe archived data should be created and preserved to ensure the enhancement of knowledge for scientific and societal benefit.
-
NOAA’s archival process should be designed to allow the integrated exploitation of data from multiple sources to answer environmental questions and support the total life-cycle aspects of individual data sets. This could potentially be accomplished through a distributed but federated archival system facilitated via a single user portal.
-
Broad community representation is essential to establish the process whereby data proposed for archiving can be evaluated and prioritized in terms of scientific and societal benefits.
-
Scientific data stewardship should be applied to all archived information so that it is preserved, remains continually accessible, and can be supplemented with additional data as discoveries build understanding and knowledge.
These preliminary principles and guidelines are intended to help NOAA and its partners begin planning specific archiving strategies for the data streams they currently collect as the Committee prepares for the next phase of its work. To implement these ideas effectively, NOAA will need to collaborate closely with its user communities and with other government agencies (e.g., the National Aeronautics and Space Administration [NASA] and the United States Geological Survey [USGS]). NOAA will also need to work with international partners through organizations such as the Group on Earth Observations, Committee on Earth Observation Satellites, International Council for Science, World Meteorological Organization, Intergovernmental Oceanographic Commission, and International Hydrographic Organization to develop internationally agreed standards and protocols to ensure that key data sets can be accessed and exchanged.
During the second phase of its activities, the Committee plans to convene a user forum designed to engage both NOAA data managers and NOAA’s user community to gain additional information and insights on effective data access strategies. Following the forum, and after considering additional materials and deliberating extensively, the Committee will release a final report. This document will include an expanded set of principles and guidelines, illustrated with examples, that NOAA and its partners can use to identify the observational and generated data that should be preserved indefinitely versus those that require only limited storage lifetimes or can be readily regenerated from archived first-stream input, and to determine the extent to which this wide variety of data should be made available. A more extensive discussion of the specific scientific requirements for data access and data stewardship, including climate change detection and analysis, will also be included, as will further discussion of funding issues, both in general and in the context of specific archiving and access strategies for individual data streams.