are good examples of these efforts. The upper management of the archives at the science centers has embraced this direction, and the archives are currently implementing medium-term measures to achieve VO standards.


If the current trend toward strong collaboration continues, the archives supported by the science centers will form a homogeneous, easy-to-use system that is integrated from a user’s perspective. Each wavelength regime, however, will retain its own responsibilities for the long-term curation and preservation of its expertise. Such a system of archives needs to be sustainable. What does this sustainability imply? The committee concludes that the archives have to form a system that does the following:

  • Provides services that tap resources across the whole community, not just those from one center;

  • Facilitates the adoption of community-wide standards for data and services;

  • Provides a mechanism for collaborating on and sharing broadly useful software with other archives and with the astrophysics community;

  • Provides data, software, standards, and documentation;

  • Offers, on a regular basis, tools to teach users and developers; and

  • Supports international access to data and services.

Further discoveries will stem from the analysis of multiwavelength data sets. As access to remote data improves and user-friendly tools to support multiwavelength analysis become available, more astrophysicists are expected to rely on these archival data sets on a daily basis. This likely outcome will bring additional challenges and raise expectations. Reliability of the data archives will be crucial because more of the community’s research will depend on them. Performance will also be critical when users expect to get their data in seconds rather than hours.

Data curation and provenance tracking are notoriously labor intensive and might present the biggest challenge. As data processing evolves and the archives store derived data sets, possibly from the combination of multiwavelength data (see mention of GOODS above), it will be increasingly important to track the processing trail of the derived products. As data volumes grow, finding the relevant data sets and assessing their quality and reliability will also be increasingly important, so that the continuous evolution and curation of data, even old mission data, become crucial. Centers could take on more active roles in recent efforts to move data analysis software to the next level, in which a universal (common and distributed) analysis infrastructure supports many instrument-specific applications, some of which are developed in the community and some at the center.
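
As an illustration only, the processing trail of a derived product could be recorded by having each product carry the identifiers of its inputs, together with the processing step and software version that produced it. The sketch below is a minimal example under that assumption; the `Product` record, the `processing_trail` function, and the sample product names are hypothetical and do not describe any existing archive system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Product:
    """A raw or derived data set, with a record of how it was produced."""
    product_id: str
    step: str = "observation"        # processing step that produced it
    software_version: str = ""       # version of the software used
    inputs: Tuple[str, ...] = ()     # ids of the products it was derived from


def processing_trail(product_id: str, catalog: Dict[str, Product]) -> List[str]:
    """Return the product and all of its ancestors, inputs before outputs."""
    trail: List[str] = []
    seen = set()

    def visit(pid: str) -> None:
        if pid in seen:
            return
        seen.add(pid)
        for parent in catalog[pid].inputs:
            visit(parent)
        trail.append(pid)

    visit(product_id)
    return trail


# Hypothetical catalog: two calibrated data sets combined into one product.
catalog = {
    p.product_id: p
    for p in [
        Product("optical_raw"),
        Product("xray_raw"),
        Product("optical_cal", "calibrate", "1.2", ("optical_raw",)),
        Product("xray_cal", "calibrate", "3.0", ("xray_raw",)),
        Product("combined", "cross_match", "0.9", ("optical_cal", "xray_cal")),
    ]
}

print(processing_trail("combined", catalog))
```

Walking the recorded inputs recovers every data set that contributed to a derived product, which is the information a user would need to assess its quality and reliability.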

As software technology evolves, it is expected to become progressively easier to calibrate the data as they are accessed, guaranteeing the most up-to-date version for everyone. Calibrating data as users extract a given data set will require ever more computational resources to be co-located with the archives. This will expand the range of services that the archives will be asked to provide.
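
The idea of calibrating on access can be sketched as follows, assuming an archive that stores only raw data and applies the most recently registered calibration whenever a data set is requested. The class `OnAccessArchive`, its methods, and the toy calibration functions are hypothetical names introduced for illustration, not an existing interface.

```python
from typing import Callable, Dict, List, Tuple

Calibration = Callable[[List[float]], List[float]]


class OnAccessArchive:
    """Stores raw data only and calibrates it at retrieval time."""

    def __init__(self) -> None:
        self._raw: Dict[str, List[float]] = {}
        # Registered calibrations, ordered from oldest to newest.
        self._calibrations: List[Tuple[str, Calibration]] = []

    def store_raw(self, dataset_id: str, values: List[float]) -> None:
        self._raw[dataset_id] = list(values)

    def register_calibration(self, version: str, func: Calibration) -> None:
        self._calibrations.append((version, func))

    def fetch(self, dataset_id: str) -> Tuple[str, List[float]]:
        """Apply the newest calibration to the raw data on every request."""
        if not self._calibrations:
            raise LookupError("no calibration registered")
        version, func = self._calibrations[-1]
        return version, func(self._raw[dataset_id])


archive = OnAccessArchive()
archive.store_raw("obs1", [10.0, 20.0])

# Toy calibrations: a gain correction, later refined with an offset.
archive.register_calibration("v1", lambda counts: [c * 2.0 for c in counts])
first = archive.fetch("obs1")

archive.register_calibration("v2", lambda counts: [c * 2.0 - 1.0 for c in counts])
second = archive.fetch("obs1")

print(first, second)
```

Because the calibration runs on every request, the computational cost moves from a one-time pipeline run to each retrieval, which is why such a service would need computing resources located alongside the archive.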
