level-0 data is greater than that for higher levels of data analysis and aggregation, and whether this value offsets the costs of maintaining all levels of data.
Many of the procedures currently followed in appraising other records are valid for scientific data. Descriptions of the data, the systems that contain the data, the project that collected the data, and the purposes of the project all add to the knowledge necessary to determine the informational value of a record. Documentation of the data and of the system hardware and software is also essential to the future use of the data and must be considered in the appraisal process. Sample outputs of the systems, useful in appraising other electronic records, may not be as valuable in determining the informational value of scientific records unless the reviewer is a scientist. The following scenario may be useful in defining how the appraisals may be carried out.
Identify the data to be collected in the scientific missions or investigations; prepare a data management plan; name a data manager; and define information systems that can adequately accept, process and store the data. Involve the agency's records manager in the planning process.
Describe the data to be collected and the proposed disposition in an SF-115 form and send this to NARA for approval.
In the initial analysis of the data, focus on the success or failure of the project, the quality and completeness of the data, and their usability. Determining the validity of the data collection is in itself an appraisal process.
Process, organize, and fully document the data to make them accessible to primary users. Peer review procedures can assist the scientists and the data managers in determining whether the data have received careful and accurate analysis and need to be retained.
Appraise any data sets resulting from additional analyses.
Complete the appraisal of data when they are in the agency's “deep archive.” The location of the final archival set should be determined at this time.
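The scenario above can be read as an ordered sequence of appraisal stages, each of which may revise a tentative disposition. The following Python sketch is purely illustrative and is not drawn from any NARA or agency system: the stage names, the `DataSet` class, and the `advance` function are hypothetical labels introduced here only to make the ordering explicit.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional, Tuple


class Stage(IntEnum):
    """Hypothetical labels for the six steps of the scenario."""
    PLAN = 1              # data management plan prepared, data manager named
    SCHEDULE = 2          # SF-115 describing data and proposed disposition
    INITIAL_ANALYSIS = 3  # project success, data quality, completeness, usability
    DOCUMENT = 4          # process, organize, document; peer review
    DERIVED_DATA = 5      # appraise data sets from additional analyses
    DEEP_ARCHIVE = 6      # complete appraisal; settle final archival location


@dataclass
class DataSet:
    name: str
    stage: Stage = Stage.PLAN
    disposition: str = "tentative: undetermined"
    # Earlier (stage, disposition) pairs, kept so revisions stay traceable.
    history: List[Tuple[Stage, str]] = field(default_factory=list)


def advance(ds: DataSet, disposition: Optional[str] = None) -> DataSet:
    """Move a data set to the next stage, optionally revising its disposition."""
    if ds.stage == Stage.DEEP_ARCHIVE:
        raise ValueError("appraisal is already complete")
    ds.history.append((ds.stage, ds.disposition))
    ds.stage = Stage(ds.stage + 1)
    if disposition is not None:
        ds.disposition = disposition
    return ds
```

Calling `advance` repeatedly walks a data set from planning to the deep archive, and the `history` list records each earlier tentative disposition so that later reviewers can see what was changed and at which stage.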
The panel recommends that the initial appraisal of scientific data sets be made by the investigators, the information systems manager, and the agency's records manager. The agency's data manager, together with experts in the subject area, should verify the disposition determinations during peer review and as analysis of the data takes place. For larger data collection efforts that involve two or more agencies, an interagency team should review the data to determine their correlation with other agency data.
As scientific projects or missions are completed, procedures for data disposition are applied. Data are moved to archives at the agency level or to more centralized data repositories, such as NOAA's data centers, serving a number of agencies. Appraisal decisions made at this time may affect the data's accessibility to researchers in generations to come. Thus, NARA, and possibly the advisory scientific data committee, should be brought into the review and decision process at this point. Based on the review at this time, earlier tentative disposition decisions may be overridden. Also at this time, NARA may decide to accept the records for storage, or determine that a designated satellite archive would be more appropriate for long-term or permanent retention of the data.
Most space science data collected by space-based instruments may be presumed to be federal records. Ground-based data generally are not federal records.
NARA currently has no digital space science data, but needs to develop a plan if it is to undertake the long-term retention of such data. Space science data are managed and archived in a distributed system, principally by the users of the data. The panel concludes that it is impossible for NARA to maintain and adequately service all archivable scientific databases and data sets for the foreseeable future.
The policies and priorities relating to data archiving are highly variable across and even within agencies. There is concern that even agencies (e.g., NASA) with a clear charter to obtain scientific data and to make those data widely available are not fully meeting this responsibility.