The decision on where to keep a long-term archive should consider what makes an archive usable and sustainable for the community, beyond the minimal goal of preserving the bytes. The committee describes a sustainable archive as one that

  • Continually facilitates the production of new scientific results;

  • Has a strategic goal to enable more and better science;

  • Contains high-quality, reliable data;

  • Provides simple and useful scientific tools to a broad community;

  • Provides user support to the novice as well as to the power user;

  • Has many diverse uses (and users);

  • Has a core group of users for whom it is an everyday tool;

  • Collects metrics that track usage and science output;

  • Is properly curated (e.g., errors discovered are documented and fixed);

  • Adapts and evolves in response to community input; and

  • Has an adequate mix of developers, scientists, and tech support staff.

Despite the considerable efforts of archive staff to capture as much instrument metadata as possible, the scientists who deal with an instrument's quirks over the years will always understand the systematic errors in its data products more intimately. It is therefore important for each mission to develop good metadata and documentation to ensure the long-term accessibility and usability of its data. NASA astronomy science centers can play an important long-term role by capturing as much of this knowledge as possible during the mission phase, but they should also strive to retain that knowledge for as long as necessary, guided by the above criteria.

ORGANIZATION BY WAVELENGTH

Older data sets are clearly migrating into centralized facilities, and not every mission will (or should) retain its own separate archive. Although many archives specialize in broad wavelength ranges,1 those wavelength distinctions have loosened over time. There are also value-added services such as the Astrophysics Data System (ADS)2 (http://adswww.harvard.edu/) and the NASA/Infrared Processing and Analysis Center (IPAC) Extragalactic Database (NED)3 (http://nedwww.ipac.caltech.edu/), which link the data sets to the literature. Many astronomers now use these services several times a day. The archives are, of course, the primary guardians of the data sets from their main missions: the Hubble Space Telescope (HST) at the Space Telescope Science Institute (STScI); the Compton Gamma Ray Observatory, Uhuru, the Advanced Satellite for Cosmology and Astrophysics, the Roentgen satellite (ROSAT), and many others at HEASARC; Einstein and Chandra (http://cxc.harvard.edu/cda/) at the Chandra X-ray Center (CXC); and IRAS, the Two Micron All Sky Survey (2MASS), Spitzer, and many others at IPAC.

1 The UV-optical data sets are migrating to the Multimission Archive at the Space Telescope Science Institute (MAST) at http://archive.stsci.edu/; the near- and far-infrared archives are at the Infrared Science Archive (IRSA) at http://irsa.ipac.caltech.edu/, at IPAC; and the high-energy data sets are moving to the High Energy Astrophysics Science Archive Research Center (HEASARC) at http://heasarc.gsfc.nasa.gov/ at the Goddard Space Flight Center.

2 ADS, operated by Harvard and funded by NASA, contains 4.8 million searchable bibliographic records. Full-text scans of many of these records are viewable free via a browse engine.

3 NED, operated by the Jet Propulsion Laboratory under contract to NASA, contains 14 million names for over 9 million extragalactic objects and over 3.3 million bibliographic references.


