Data management is identified by the panel as a critical cross-cutting element of GOALS. This activity refers not only to the management of observational data obtained from various operational and research observing systems and networks, but also to data output (and products) from empirical and diagnostic studies, data assimilation systems, and climate prediction models as well as analysis products and information from the applications and human dimensions sectors involved in GOALS.
Each of the elements of GOALS will generate large multivariate data sets that are potentially useful to a sizable community of researchers. In developing a management strategy for these data sets, the panel recommends that the following principles be taken into consideration to maximize the utility of critical data and information resources:
- free and open access to all GOALS observations, model products, and model code. This is a basic tenet of the GOALS program.
- a distributed data management system. This model provides the most efficient access to existing products and encourages the incorporation of new products.
- The development and maintenance of a GOALS data catalog will be a critical step toward the success of the distributed database system proposed for GOALS.
- real-time data transmission to the extent possible or feasible. Overcoming the technical and financial barriers that limit real-time transmission of GOALS data is a high priority of the GOALS program.
- Adopt the FGGE classification system for data, data analyses, and information products, namely: Level 1 for raw data, Level 2 for quality-controlled data, Level 3 for analyzed fields, and Level 4 for model output and data-information products.
- Metadata and documentation of data processing techniques and dataset evaluation are essential and should be readily available.
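As an illustration only, the four-level classification listed above could be encoded so that every catalog entry carries an unambiguous level. The following is a minimal Python sketch; the names `DataLevel`, `LEVEL_DESCRIPTIONS`, and `describe` are hypothetical and are not part of any GOALS or FGGE specification.

```python
from enum import IntEnum


class DataLevel(IntEnum):
    """Four data levels as described in the text (member names are illustrative)."""
    RAW = 1                  # Level 1: raw data
    QUALITY_CONTROLLED = 2   # Level 2: quality-controlled data
    ANALYZED_FIELD = 3       # Level 3: analyzed fields
    MODEL_OUTPUT = 4         # Level 4: model output and data-information products


# Human-readable descriptions, taken from the wording of the list above.
LEVEL_DESCRIPTIONS = {
    DataLevel.RAW: "raw data",
    DataLevel.QUALITY_CONTROLLED: "quality-controlled data",
    DataLevel.ANALYZED_FIELD: "analyzed fields",
    DataLevel.MODEL_OUTPUT: "model output and data-information products",
}


def describe(level: int) -> str:
    """Return a label for a level number; raises ValueError for unknown levels."""
    lvl = DataLevel(level)
    return f"Level {lvl.value}: {LEVEL_DESCRIPTIONS[lvl]}"
```

Using an integer-backed enumeration keeps the levels ordered, so a catalog query such as "Level 2 or higher" reduces to a simple comparison.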
Data management structures already exist under the auspices of WCRP and the federal agencies that launch and operate satellites and that routinely process, archive, and distribute the satellite data. The panel recommends that GOALS work with the USGCRP, the World Data Centers, federal satellite agencies, and WCRP data management structures to minimize redundancies and costs and to maximize data distribution, so that data are made available efficiently for research and applications.
The management of data from GOALS process studies will require special attention. It is important for the respective project scientists and managers to articulate data management requirements, develop a cohesive plan, and obtain funding and institutional commitments for specialized data management needs.
Each data set should be accompanied by a comprehensive metadata file that documents how the data were collected, quality controlled, and processed, together with their characteristics, resolution, and so forth. In addition, special attention should be paid to:
- changes in observation times;
- changes in the exposure of instruments, station environment, and the effects of urbanization;
- changes in spatial distribution and areal coverage; and
- changes in methods of processing and analyzing observations.
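One way to picture such a metadata file is as a structured record that pairs the descriptive fields with a dated log of the four kinds of change listed above. The sketch below is a hypothetical layout in Python, not a format prescribed by the panel; all class names, field names, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List


class ChangeKind(Enum):
    """The four categories of change singled out in the text."""
    OBSERVATION_TIME = "observation times"
    EXPOSURE_OR_ENVIRONMENT = "instrument exposure, station environment, urbanization"
    SPATIAL_COVERAGE = "spatial distribution and areal coverage"
    PROCESSING_METHOD = "methods of processing and analyzing observations"


@dataclass
class ChangeEvent:
    when: date          # date on which the change took effect
    kind: ChangeKind    # which of the four categories it belongs to
    note: str           # free-text description of the change


@dataclass
class DatasetMetadata:
    title: str
    collection_method: str
    quality_control: str
    processing: str
    resolution: str
    changes: List[ChangeEvent] = field(default_factory=list)

    def changes_of(self, kind: ChangeKind) -> List[ChangeEvent]:
        """Return the logged changes of one kind, oldest first."""
        return sorted((c for c in self.changes if c.kind is kind),
                      key=lambda c: c.when)

    def missing_fields(self) -> List[str]:
        """List descriptive fields that were left empty."""
        names = ("title", "collection_method", "quality_control",
                 "processing", "resolution")
        return [n for n in names if not getattr(self, n).strip()]


# Made-up placeholder values, purely to show how the record would be filled in.
example = DatasetMetadata(
    title="Example station temperature series",
    collection_method="manual thermometer readings",
    quality_control="range and consistency checks",
    processing="daily means from fixed-hour readings",
    resolution="",  # deliberately left empty to show the completeness check
    changes=[
        ChangeEvent(date(1975, 1, 1), ChangeKind.OBSERVATION_TIME,
                    "observation hour changed"),
        ChangeEvent(date(1960, 6, 1), ChangeKind.OBSERVATION_TIME,
                    "second daily reading added"),
        ChangeEvent(date(1982, 3, 15), ChangeKind.EXPOSURE_OR_ENVIRONMENT,
                    "station relocated"),
    ],
)
```

Here `example.missing_fields()` flags the empty `resolution` entry, and `example.changes_of(ChangeKind.OBSERVATION_TIME)` returns the two observation-time changes in chronological order, which is the kind of history a researcher would need before analyzing the series for climate variability.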
A comprehensive data management program should also involve the resurrection of existing data sets (i.e., an effort aimed at the retrospective collection and compilation of missing data in the historical data archives). The total extent and magnitude of the observations made over the past 100 years far exceed the data currently available for research purposes. While there have been commendable efforts to resurrect past observations from various national and international archives, such efforts need to be continued with adequate funding. These data are particularly important for the investigation of seasonal-to-interannual climate variability, and they are even more important for studies on longer time scales. Funding agencies usually underestimate this task.
Several of the points and issues above, which the panel proposed earlier (NRC, 1995) and has reiterated throughout this document, are elaborated in Bits of Power: Issues in Global Access to Scientific Data (NRC, 1997).