Box IV.1: General Definitions
Observational Data Set—An observational data set is an aggregate of measurements or observations of environmental properties or the properties of objects in the environment. Measurements are the direct or minimally processed outputs from sensors, and thus are often in engineering units, such as volts. Observations represent the best assessment of the state of the environment expressed in scientific units, such as degrees centigrade. Typically, both measurements and observations are associated with a time and space specification, and are aggregated into data sets on the basis of some common context criteria, such as all salinity observations from the same platform or all surface temperatures for a narrow range of time.
Metadata—A class of data used to describe the content, representation, structure, and context of some well-defined set of observational data. Content metadata define observational data items in terms of description, units of measurement, and legitimate value ranges. Representation metadata define the units of measurement for each observational data item and the physical format for the whole data set. Structure metadata categorize the groupings of data items into logical aggregates, which typically correspond to real-world entities. Context metadata define all ancillary information associated with the collection, processing, and use of the observational data. The metadata may include a quality assessment of an individual observation in a data set and an overall evaluation of the observational data set.
Information Model—An information model is the specification of the objects (things, people, events, concepts) about which one needs to maintain information. The specification should identify and define the objects, important attributes of the objects, and inherent relationships between the objects. The model may be expressed in a formal language or a narrative text with a graphical depiction that follows well-defined notation standards. Examples of pertinent real-world objects in the oceanographic domain include entities such as platforms, instruments, sensors, investigators, sampling plans, processing algorithms, stations, sections, projects, data collection runs, and calibrations.
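As an illustration only, the definitions in this box can be sketched as simple data structures. The Python sketch below is not a schema proposed by the panel. Every class, field, and function name in it is hypothetical, and the linear calibration is an assumed stand-in for whatever processing turns a sensor output into an observation. It shows a measurement in engineering units converted to an observation in scientific units, and a data set that carries the four kinds of metadata alongside its observations.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Measurement:
    """Direct sensor output in engineering units (here, volts)."""
    volts: float
    time: datetime
    latitude: float
    longitude: float


@dataclass
class Observation:
    """Best assessment of the environment, in scientific units."""
    value: float
    time: datetime
    latitude: float
    longitude: float
    quality_flag: Optional[str] = None  # quality assessment of this observation


@dataclass
class ContentMetadata:
    """Description, units, and legitimate value range of a data item."""
    description: str
    units: str
    valid_min: float
    valid_max: float


@dataclass
class ObservationalDataSet:
    """Observations sharing a common context, plus their metadata."""
    content: ContentMetadata
    representation: Dict[str, str]  # e.g. physical format of the data set
    structure: Dict[str, str]       # grouping of items into logical aggregates
    context: Dict[str, str]         # collection, processing, and use information
    observations: List[Observation] = field(default_factory=list)
    overall_quality: Optional[str] = None  # evaluation of the whole data set

    def flag_out_of_range(self) -> int:
        """Flag observations outside the legitimate range; return how many."""
        count = 0
        for obs in self.observations:
            if not (self.content.valid_min <= obs.value <= self.content.valid_max):
                obs.quality_flag = "out_of_range"
                count += 1
        return count


def to_observation(m: Measurement, gain: float, offset: float) -> Observation:
    """Assumed linear calibration (hypothetical): value = gain * volts + offset."""
    return Observation(gain * m.volts + offset, m.time, m.latitude, m.longitude)
```

Keeping the legitimate value range in the content metadata, rather than in the checking code, means that a later user of the archive can repeat the same quality check from the stored record alone.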
Environmental data are becoming increasingly important to scientists, policymakers, resource managers, educators, students, and the general public, both because technology brings these users closer to data sets (and to their manipulation and display) and because such data are relevant to understanding and managing the environment. Because these users must be provided promptly with correct and properly documented data, there is an ever-increasing need for scientific review of oceanographic data, for active distribution of the data and assistance with their proper use, and for safe, long-term archiving of the data.
The panel approached the study problem by identifying the items needed to aid the steering committee in synthesizing effective recommendations for the National Archives and Records Administration (NARA) and the National Oceanic and Atmospheric Administration (NOAA). The panel identified four needed items:
Rules for retention or deletion of observational data sets from the ocean sciences, and a mechanism of appraisal that would apply the rules;
A framework that specifies the types of metadata needed to make long-term observational data sets useful to primary, secondary, and tertiary users;
An architectural model to handle observational data effectively throughout their life cycle, from creation to long-term archival storage; and
Recommendations to existing organizations in the data management infrastructure to effect workable long-term retention of oceanographic records in electronic form.
The panel used the task statement provided by the steering committee to help guide its discussion and analysis.
The oceans and atmosphere are turbulent fluids, constantly changing over many spatial and temporal scales. The numerous types of data that describe the oceans are often unrelated to one another, and even those that are