  • Data are shared and not duplicated;

  • Each agency fulfills its responsibility for quality controls; metadata structures; documentation of analysis, forms, and systems designed to process the data; and production of data products and development of services and mechanisms for making the data available and usable by the scientist and nonscientist alike; and

  • Each agency participates in electronic networks that enable access, sharing, and transfer of data.

Documentation Requirements

Effective use of space science data well into the future requires comprehensive and accurate documentation of the program that generates the data, the data themselves, the analyses carried out using the data, and the system that maintains and stores the data. Documentation should follow the project life cycle, beginning with the initial plans for a project or mission. In the case of NASA, this results in a NASA Data Management Plan (NDMP), which defines the data management considerations at each stage of the life cycle: collection; processing; analysis and peer review; reprocessing, reformatting, and reanalysis; storage; primary use; secondary use; and final disposition (permanent archiving or destruction). For a comprehensive discussion of the metadata requirements and framework relevant to all observational data, see the report of the Ocean Sciences Data Panel.
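
As an illustration only, the life-cycle stages listed above can be written down as an ordered enumeration and used to check whether a draft plan says something about every stage. The report does not specify a format for the NDMP; the `Stage` names, the `plan` dictionary, and the `missing_stages` helper below are hypothetical and are not part of any NASA specification.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Life-cycle stages in the order the text lists them (names are illustrative)."""
    COLLECTION = 1
    PROCESSING = 2
    ANALYSIS_AND_PEER_REVIEW = 3
    REPROCESSING_REFORMATTING_REANALYSIS = 4
    STORAGE = 5
    PRIMARY_USE = 6
    SECONDARY_USE = 7
    FINAL_DISPOSITION = 8  # archive permanently or destroy


def missing_stages(plan):
    """Return the stages with no non-blank entry in `plan`, in life-cycle order.

    `plan` is a hypothetical mapping from Stage to free-text data management notes.
    """
    return [stage for stage in Stage if not plan.get(stage, "").strip()]


# Example draft plan: only one stage has real content so far.
draft = {
    Stage.COLLECTION: "Raw telemetry is logged with time tags.",
    Stage.STORAGE: "   ",  # blank entries count as missing
}
gaps = missing_stages(draft)
```

A check of this kind only confirms that each stage is addressed; whether the documentation for a stage is adequate remains a matter for expert review.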

4 SUGGESTED RETENTION CRITERIA AND APPRAISAL GUIDELINES

Retention Criteria

The panel identified the following retention criteria, ranked in priority order, based on its discussions and a review of the literature.

  1. Significant value of the data

    Do the data contain fundamental information that will be of use to future researchers or future national programs? A consideration for retention of a data set is whether the data have resulted in significant scientific return. That is, have they been used in scientific analyses? If not, a decision is needed as to whether the data have potential for future use. That determination is likely to involve considerations unrelated to the primary research value, including whether the data document important characteristics of the program and the mission of the agency that produced them.

  2. Adequacy of documentation

    Do the data sets have accompanying documentation containing data formats, conversion factors to physical units, and error assessment information? Are there ancillary data with dates, times, ephemerides, etc.? Are the necessary software and algorithms included?

  3. Cost of replacement

    Could the data be reacquired if a future national need for them were to arise? If so, would the data be costly to reacquire relative to the costs of preservation?

  4. Uniqueness of data

    Do the data exist in an accessible repository that meets NARA standards of permanence and security? If so, are they adequately backed up?

  5. Peer review

    Has the data set undergone a formal peer review to certify its integrity and completeness, or is there documented evidence that use of the data set has led to publication of results in peer-reviewed journals? Have expert users provided evidence that the data set is as described in the documentation?
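
As a minimal sketch, a reviewer's answers to the five criteria above could be recorded in priority order as follows. The panel defines no scoring scheme, weights, or thresholds, so none is assumed here; the `Appraisal` record, its field names, and the convention that `True` means "the reviewer judged that this criterion favors retention" are all hypothetical.

```python
from dataclasses import dataclass, field

# The five criteria in the panel's priority order (keys paraphrase the headings).
CRITERIA = (
    "significant_value",
    "adequacy_of_documentation",
    "cost_of_replacement",
    "uniqueness_of_data",
    "peer_review",
)


@dataclass
class Appraisal:
    """Hypothetical record of one reviewer's findings for one data set."""
    data_set: str
    # criterion -> True if the reviewer judged it to favor retention, else False
    findings: dict = field(default_factory=dict)

    def unanswered(self):
        """Criteria not yet assessed, in priority order."""
        return [c for c in CRITERIA if c not in self.findings]

    def favorable(self):
        """Criteria judged to favor retention, in priority order."""
        return [c for c in CRITERIA if self.findings.get(c) is True]

    def highest_priority_unfavorable(self):
        """First criterion, by priority, judged not to favor retention (or None)."""
        for c in CRITERIA:
            if self.findings.get(c) is False:
                return c
        return None


# Example: a partially completed appraisal.
appraisal = Appraisal(
    "example data set",
    {"significant_value": True, "adequacy_of_documentation": False},
)
```

The record deliberately stops short of producing a retain-or-discard verdict: the criteria are ranked, not weighted, and the final judgment rests with the appraisers.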

Ancillary Recommendations

  1. As much scientific data as possible should be preserved.

    In reviewing data for retention, no arbitrary percentage of available information should be regarded as an adequate archive.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.