2
Commonalities Between Requirements for the ERA and Requirements for Other Activities
The committee heard repeatedly in briefings by NARA staff that NARA’s requirements are unique. This perspective is found in numerous articles, position statements, and requirements documents related to the ERA program.1 It is based on the idea that special requirements apply to archival institutions, that NARA plays a unique role in the federal government, and that the scale and diversity of the government’s programs exceed those of any other entity. NARA does have unique statutory responsibilities for the management of federal government records and the identification and preservation of records of long-term value. However, its situation is in many respects not all that different from that of many other organizations that also have mandates to preserve digital information.
It would be a mistake to start with the position that (1) NARA’s requirements are unique and (2) preserving records is fundamentally different from preserving other types of information. Such an assumption would limit NARA’s potential to benefit from common solutions to archiving challenges found throughout the federal government and other organizations with long-term preservation needs. State and local governments face many of the same problems, as does the private sector. Moving away from the position that NARA’s requirements are unique would also open up many possibilities for closer collaboration between NARA and other organizations with similar archiving challenges. The benefits of common solutions and collaboration would include these:
-
Freeing up development resources to tackle more exceptional problems. To the extent that NARA can make use of common (including off-the-shelf) options for such components as storage and data management, resources will be available to solve more specialized problems such as ingest of and access to unusual record formats.
1 |
See, e.g., Kenneth Thibodeau, 2001, “Building the Archives of the Future,” D-Lib Magazine, 7(2), February. Available online at <http://www.dlib.org/dlib/february01/thibodeau/02thibodeau.html>; and NARA, 2002, Electronic Records Archives Feature List, ERA Program Office, NARA, October 31. |
-
Reducing development costs. To the extent that commonalities are identified, NARA’s development costs can be reduced by increasing the potential size of the market for archiving system components. Development costs would also be reduced by collaborative work on common tools (e.g., format conversion, automatic ingest, and metadata extraction and markup), standards (e.g., metadata standards and key interfaces to modules), and other technologies for preservation.
-
Easing ingest. To the extent that NARA can build a repository on common standards, it will make it easier for federal agencies to deposit digital materials in a NARA repository or for NARA to harvest records worthy of preservation.
-
Facilitating interoperability. Adoption of common standards by NARA and other institutions will facilitate the development of federated collections by third parties and enable users to employ common tools across multiple digital repositories.
-
Transferring benefits from NARA’s experience and investments to a larger community and vice versa. NARA’s participation in developing and disseminating information about the OAIS model and its work with the group developing the Metadata Encoding and Transmission Standard (METS) are good examples.
-
Sharing technical expertise and knowledge. Both formal and informal mechanisms can be used to tap outside knowledge of and experience with digital archiving. As discussed in Chapter 6, IT expertise is critical to the success of the ERA program.
-
Identifying and recruiting new IT talent. More collaboration will increase the opportunities to identify new IT talent to design and implement the ERA program. For example, a number of graduate students are being trained in digital libraries.
Recent research and development activities in digital preservation have emphasized defining archiving problems in as generic a way as possible and seeking solutions that are common to the many types of organizations with long-term preservation needs.2 The collection-based persistent archive model developed at SDSC (with support from NARA), for example, seeks solutions to digital preservation by integrating archival storage technology from supercomputer centers, data grid technology from the scientific community, information models from digital libraries, and preservation models from the archival community. The SDSC persistent archive prototype (discussed in Chapter 3) aims to support long-term preservation of collections from scientific data repositories, large digital libraries, and archives with the same architecture. NARA also contributed to the development and dissemination efforts of the Consultative Committee for Space Data Systems, which developed the OAIS reference model, now an international standard.3 The OAIS reference model provides common terminology and a high-level framework for an archive.4
2 |
See, for example, Commission on Preservation and Access and Research Libraries Group (RLG), 1996, Preserving Digital Information: Final Report and Recommendations, RLG, Mountain View, Calif., May, available online at <http://www.rlg.org/ArchTF/>; CEDARS Metadata Standards, available online at <http://www.leeds.ac.uk/cedars/index.html>; RLG-OCLC Working Group on Digital Preservation Metadata, 2002, Trusted Digital Repositories: Attributes and Responsibilities (an RLG-OCLC report), RLG, Mountain View, Calif., May, available online at <http://www.rlg.org/longterm/repositories.pdf>; and various documents from the Digital Preservation Coalition, available online at <http://www.dpconline.org/>. |
Many other groups, including these, are working on long-term digital preservation issues:
-
Library of Congress (LoC). Through the National Digital Information Infrastructure and Preservation Program, LoC is developing a repository for LoC’s own digital collections (both born-digital and turned-digital); developing standards and mechanisms for the exchange of preserved digital objects; and working toward mechanisms for interoperability among digital repositories. Like NARA, LoC is seeking more efficient methods for acquiring digital collections in many different formats, working to identify metadata standards for access and intellectual property rights management, and seeking technology for a repository system.
-
National archives and libraries in other countries. The Netherlands’ Koninklijke Bibliotheek (KB) has built on the work of a European collaborative project, Nedlib, to build a system to store KB’s collection of born-digital documents.5 IBM is implementing the system, drawing on a number of off-the-shelf products.
-
Digital library researchers. This research community addresses issues such as storage and data management of large amounts of information in diverse formats and media (e.g., text, images, video, music, recorded speech); search within a single archive using controlled-vocabulary and/or content-based indexing; federated search across collections, including collections operated by different organizations; metadata conversion; format conversion; resource (archive) selection; and interoperability among archives.
-
Digital library operators. Much practical experience has been gained by the operators of such digital libraries as the California Digital Library, JSTOR, the National Library of Medicine, and the Library of Congress.
-
Federal agencies. Some federal agencies, including the National Aeronautics and Space Administration, the National Institutes of Health (especially the National Library of Medicine), the Department of Defense, and the intelligence agencies, already manage large digital repositories.
-
State and local governments. State and local governments also maintain archives and thus face long-term preservation challenges quite similar to those of NARA.
-
Private sector. Online text retrieval services such as Lexis-Nexis and WestLaw maintain very large digital libraries. A number of industry sectors, such as pharmaceuticals, public utilities, aviation, and those using hazardous materials, must retain very large collections of records for the long term to fulfill regulatory requirements.
3 |
Consultative Committee for Space Data Systems (CCSDS). 2002. Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1 (Blue Book). CCSDS Secretariat, National Aeronautics and Space Administration, Washington, D.C. January. Available online at <http://wwwclassic.ccsds.org/documents/pdf/CCSDS-650.0-B-1.pdf>. |
4 |
The model does not provide the specifications for a particular architecture or an implementation framework; it does define the core functions of ingest, archival storage, data management, access, administration, and preservation planning. |
5 |
In summary, NARA’s requirements for many aspects of the ERA, such as archival storage, data management, preservation, administration, and preservation planning, are surprisingly similar to the requirements of any other organization preserving digital data for the long term. NARA has much to gain by starting from the premise of common requirements in these areas and then clearly specifying where its requirements are different or unique.