The volume, diversity, and complexity of space science data require that the scientific community, the institutions that collect and manage these data, and the National Archives and Records Administration (NARA) reexamine issues relating to the archiving of such data, both for primary (scientific) use and for secondary (long-term archive) use. Most of the space science data are now created or collected in electronic form and are thus not available for archival use in traditional paper form. Decisions must be made on responsibility for the data and their management, both short- and long-term; the format in which they will be maintained; the criteria that can be applied to decisions on retention; and relationships among the scientific researchers, the scientific agencies and institutions, and NARA.
Space science data are created at the project or operating levels of an organization. They are recorded in human-readable or machine-readable format and may be found in laboratory notebooks, completed forms, tabulations and computations, graphs, or microforms, or in data sets housed on compact disks, on magnetic tape, or in disk storage in micro-, mini-, or mainframe computers.
Space science data are generally categorized as observational, experimental, or derived. Observational data are collected on space missions or observed from the ground, and added to specific data sets in structured or unstructured data management programs. Experimental data are obtained from laboratory or mission experiments and are documented on paper or in computers. Derived data result from the analysis of various data sets or the running of scientific models.
Observational data are a snapshot in time and therefore cannot be collected again; a later observation cannot duplicate what was collected earlier. Experimental data, however, may or may not be reproducible in the future, depending on their purpose, structure, location, and cost. Derived data can almost always be regenerated by repeating the analysis or rerunning the model to arrive at the same data set.
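As an illustration only, the three categories and the reproducibility of each could be recorded as a simple lookup, for example as a metadata field consulted when retention decisions are made. The sketch below is hypothetical: the names `Category`, `Reproducibility`, and `reproducibility` are assumptions introduced here and do not correspond to any NASA, NOAA, or NARA system.

```python
from enum import Enum


class Category(Enum):
    """The three general categories of space science data."""
    OBSERVATIONAL = "observational"
    EXPERIMENTAL = "experimental"
    DERIVED = "derived"


class Reproducibility(Enum):
    """How readily a data set could be obtained again (hypothetical labels)."""
    NOT_REPRODUCIBLE = "snapshot in time; cannot be collected again"
    CASE_BY_CASE = "depends on purpose, structure, location, and cost"
    USUALLY_REPRODUCIBLE = "analysis or model can almost always be rerun"


# Mapping taken directly from the distinctions drawn in the text above.
REPRODUCIBILITY = {
    Category.OBSERVATIONAL: Reproducibility.NOT_REPRODUCIBLE,
    Category.EXPERIMENTAL: Reproducibility.CASE_BY_CASE,
    Category.DERIVED: Reproducibility.USUALLY_REPRODUCIBLE,
}


def reproducibility(category: Category) -> Reproducibility:
    """Return the reproducibility class for a data category."""
    return REPRODUCIBILITY[category]
```

Such a field would not settle a retention decision by itself; it only makes explicit which data sets would be lost permanently if they were not preserved.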
The development of an archivable data set requires a successful data management program leading up to the archive. The steps in effective data management include comprehensive data life-cycle planning, data acquisition, quality control and validation, reprocessing, storage, retrieval, and dissemination. Data management services include maintaining active databases, documenting algorithm development, providing interactive access to both program data and data from other sources, providing data handling services, and assisting secondary users in accessing collected data, among other functions.
The two largest collectors and managers of space science data, NASA and NOAA, have established, with varying levels of efficacy, data management programs designed to ensure that the data collected in any manner are available to, and accessible by, the science community. Data management plans were not in place for many of the earlier NASA space missions, and thus the data are unstructured and, in many instances, unusable. In the later missions, however, a data management plan has been required at the initial stages of the mission, metadata requirements have been put in place, and the necessary documentation has been provided.
NASA's former Office of Space Science and Applications issued a revised “Policy for the Management of the Office of Space Science and Applications' Science Data” in 1992 that still governs the management of space science data. The policy requires a Project Data Management Plan (PDMP), early in the project's life, which addresses the “total flow of research data.” It provides for data management planning throughout the project planning and implementation phases and for the updating of the plan as necessary. In general, the policy mandates that:
Science data generators and users are the primary source of requirements and the final judges of quality and value of the data.
NASA will establish and maintain archives to preserve and make accessible valuable science data and information, including information about the data holdings, and guidance and aids for locating and obtaining the data.
A review process will be established to determine what data should be archived and to assure conformance with completeness and quality standards.