Some data are measured specifically for the experiment, and others are extracted from larger ongoing operational data collection efforts. Packaging these together creates problems of redundancy for data taken from operational efforts. A program to examine cloud radiative processes, called FIRE (First ISCCP Regional Experiment), integrated data sets from satellite, ship, aircraft, and land-based sources. Four experiments in the last eight years each produced about 5 gigabytes of data, a significant fraction of which was operational data. With this quantity of data, redundancy in an archive is not a significant problem. However, the recently initiated Atmospheric Radiation Measurement (ARM) project of the Department of Energy plans to assemble data sets of tens of gigabytes, much of which will be part of other, larger data sets.
There are many different types of paleoclimatology data. Such data include tree-ring widths; gas concentrations, isotope ratios, and dust levels in ice cores; sediment-core analyses; paleoflood-stage indicators; fossils; and even the contents of ancient pack rat middens. After measurements are made on the samples collected, the original samples typically are stored for very long periods of time. Most measurements are very labor intensive. As a result, paleoclimate data sets are generally very small.
Paleoclimate data are almost all collected originally for research purposes. However, data on events long past also can have operational value. In vital studies such as the determination of the magnitude of extreme precipitation events for dam licensing or utility siting and design, the value of the limited historical precipitation record can be enhanced greatly by geomorphological indicators of paleoflood stage and sediment deposits.
It is usually possible, in principle, to recreate data similar to existing or lost paleoclimate data, but paleoclimate studies are very expensive. Most of the work in the United States is government funded. However, because samples come from all parts of the world, including very remote regions, many projects are cooperative international efforts. The data sets are usually small, making it possible for most of the data to be contained within papers in the open literature. There is an effort to build a centralized archive at the National Geophysical Data Center in Boulder, Colorado, which is the primary World Data Center for Paleoclimatology Data Sets. However, some of the original data can still be found only in the custody of the investigators who prepared them. A small number of very prominent data sets have been made available from data centers or over computer networks.
Laboratory data relevant primarily to the atmospheric sciences are fairly limited. The nature of these data and their archiving problems are similar to those covered by the Physics, Chemistry, and Materials Sciences Data Panel in this volume. The relevant laboratory data come largely from two types of work: experiments attempting to simulate some atmospheric phenomenon and experiments to develop a measurement technique or sensor. Laboratory thermodynamic data are essential for atmospheric research and operational activities, but these are largely not determined or archived by atmospheric scientists.
Recreation of most laboratory data is possible. However, some data related to sensor response of operational measuring devices may be nearly impossible to reproduce once the sensors are no longer made and parts have degraded. Sensor response data are an important part of the metadata for operational measurements. There is no organized archiving of laboratory data for the atmospheric sciences. Summarized forms of these data can usually be found in the open scientific literature.
Meteorological and other atmospheric data are used for various purposes on different time scales. It is convenient to delineate three: (1) real-time or current, (2) recent past or short-term retrospective, and (3) distant past or long-term retrospective. Meteorological data are probably used by a much wider segment of the U.S. population than data from other scientific disciplines, because they relate directly to the daily practical concerns of nearly everyone. Other kinds of geophysical data (solid earth, upper atmosphere, oceans, etc.) are more likely to be used only by the scientific community or technicians, whereas there is a large lay audience for weather and climate information.