BOX 5.1 Some Possible Elements of a Digital Record

  1. Original files: one or more digital files, the original bit stream or native form of the record represented in its native data type. A record may consist of more than one file, e.g., a report in which each chapter is represented by a separate word-processor file.

  2. Optional derived forms: digital files obtained from the original files by a converter. ERA policies might encourage saving certain kinds of derived forms, for example:

    • A form whose data type is chosen to simplify “presentation” or “rendering” of the record in a visible form for printing or display.

    • A form whose data type is chosen to simplify content searching.

  3. Metadata for the record. In addition to the metadata normally captured for records, electronic records carry additional metadata, such as:

    • Data types of the digital files and derived forms, with sufficient information to allow finding documentation about the data types. Data types will usually be versioned.

    • Relationships among the digital files that constitute the record.

    • Integrity checks (e.g., a cryptographic hash) for the digital files and the metadata.

    • Ephemeral/nonderivable metadata—i.e., properties of the context in which the record was created that are not specified in the record itself.

    • Derived metadata—i.e., properties that have been extracted from the record.

    • Provenance and history, such as evidence that the record was transferred accurately to the archive (a form of ephemeral metadata). In the case of derived forms, the metadata identify the converter used to obtain the derived form from the original.

    • Unique identifier of the record.

    • The data type of the metadata—i.e., the definitions of metadata elements used to construct the metadata for this record.
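The elements listed in the box can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class names, field names, the "#metadata" checksum key, and the "example-schema/1.0" label are hypothetical and are not taken from any ERA specification. It shows original files, optional derived forms that name their converter, a metadata dictionary with a declared metadata data type, and SHA-256 integrity checks covering both the files and the metadata.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class DigitalFile:
    name: str
    data_type: str   # versioned data type label (hypothetical convention)
    content: bytes   # the bit stream itself


@dataclass
class DerivedForm:
    file: DigitalFile
    purpose: str     # e.g. "rendering" or "search"
    converter: str   # provenance: which converter produced this form


@dataclass
class DigitalRecord:
    identifier: str                                   # unique identifier
    originals: list[DigitalFile]                      # one or more original files
    derived: list[DerivedForm] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)
    metadata_schema: str = "example-schema/1.0"       # data type of the metadata (assumed name)
    checksums: dict[str, str] = field(default_factory=dict)

    def _digests(self) -> dict[str, str]:
        """Compute SHA-256 digests for every file and for the metadata."""
        files = self.originals + [d.file for d in self.derived]
        digests = {f.name: hashlib.sha256(f.content).hexdigest() for f in files}
        # Serialize metadata with sorted keys so the digest is reproducible.
        canonical = json.dumps(self.metadata, sort_keys=True).encode("utf-8")
        digests["#metadata"] = hashlib.sha256(canonical).hexdigest()
        return digests

    def seal(self) -> None:
        """Record the current digests as the integrity checks."""
        self.checksums = self._digests()

    def verify(self) -> list[str]:
        """Return the sorted names of items whose digest no longer matches."""
        current = self._digests()
        return sorted(k for k in self.checksums if current.get(k) != self.checksums[k])
```

In this sketch, calling seal() at ingest and verify() later returns an empty list while nothing has changed, and otherwise names the files (or "#metadata") whose contents differ from what was sealed.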

a separate file. Metadata that pertain to collections as a whole might be stored in yet another file, referenced in metadata for each record in the collection.

The data model must also deal with embedding and aggregation. Embedding occurs when one record is contained within another, e.g., a spreadsheet attached to an e-mail message. Aggregation occurs when several records are saved together, e.g., a series of e-mail messages saved in a single file, even though each message is to be treated as a separate record. Another form of aggregation that may be desirable is the container (e.g., as used by the SDSC demonstration), which simply collects a group of records into a single digital file for more efficient handling by the file system.²
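Embedding and container-style aggregation can both be illustrated with the Python standard library. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the tar format is used only as a stand-in for a container; it is not the format used by the SDSC demonstration. The first function pulls embedded attachments out of an e-mail message so that each can be treated as a record in its own right, and the other two pack and unpack a group of records as a single file.

```python
import io
import tarfile
from email import message_from_bytes, policy


def embedded_records(raw_message: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, content) for each attachment embedded in an e-mail."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]


def pack_container(records: dict[str, bytes]) -> bytes:
    """Collect several records into one tar-format byte string."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in records.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_container(blob: bytes) -> dict[str, bytes]:
    """Recover the individual records from a container made by pack_container."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }
```

The design point is that the container changes only how records are stored, not what counts as a record: unpacking returns the same named byte streams that were packed.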

The archive should contain complete documentation about all versions of the data model, including specifications of the data types it uses. Since metadata sets are likely to proliferate

2  A container is distinct from an archivist’s “collection.” A collection may span several containers, and several collections might fit within a single container.


