Glossary

This glossary was developed by Library of Congress Junior Fellows and released in 2013 in support of NDSA’s work on Levels of Preservation, as a means of providing a succinct definition of terms of special value to the NDSA and its extended digital stewardship community. Learn more about the development of the glossary here.

A set of Working Definitions of Terms was created to support the 2019 Levels of Digital Preservation.

Term Definition

Archival original

Use instead "Received Version"

Authenticity

A mechanical characteristic of any digital object that reflects the degree of trustworthiness in the object, in that the supportive metadata accompanying the object makes it clear that the possessed object is what it purports to be.

Backup

Additional copies of a digital asset made to protect against loss due to unintended destruction or corruption of the primary set of digital assets. The essential attribute of a backup copy is that the information it contains can be restored in the event that access to the master copy is lost.

Bag

A package of content that conforms to the BagIt Specification (specification available at http://www.digitalpreservation.gov/documents/bagitspec.pdf). Under the specification, a bag consists of a base directory containing a small amount of machine-readable text to help automate the content's receipt, storage and retrieval, and a subdirectory that holds the content files. See also "BagIt Specification" and "Bagger."
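
As a rough illustration of the layout described above, the following Python sketch writes a simplified bag using only the standard library. The function name make_minimal_bag, the input shape, the default version string and the choice of SHA-256 for the payload manifest are assumptions made for this example. It is not the Bagger tool or a complete implementation of the specification, and it omits optional tag files.

```python
import hashlib
import tempfile
from pathlib import Path


def make_minimal_bag(bag_dir: Path, payload: dict[str, bytes], version: str = "0.97") -> None:
    """Write a simplified bag: bag declaration, payload manifest and data/ directory.

    `payload` maps relative file names to file contents (hypothetical input shape).
    `version` is an assumption; use the version of the specification you target.
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        # One manifest line per payload file: checksum, then the file path.
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    # The machine-readable text files sit in the base directory.
    (bag_dir / "bagit.txt").write_text(
        f"BagIt-Version: {version}\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example_bag"
    make_minimal_bag(bag, {"letter.txt": b"Dear reader"})
    for path in sorted(bag.rglob("*")):
        print(path.relative_to(bag).as_posix())
```

The base directory holds the text files a receiving system reads first, while the content itself stays untouched under data/.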

Bagger

A graphical software application tool to produce a package of data files that conforms to the BagIt Specification. See also "BagIt Specification" and "Bag."

BagIt Specification

An Internet Engineering Task Force (IETF) Internet-Draft specification for a hierarchical file packaging format for the storage and transfer of arbitrary digital content. Specification available at http://www.digitalpreservation.gov/documents/bagitspec.pdf. See also "Bag" and "Bagger."

Best Edition

The edition of a work, published in the United States at any time before the date of deposit, that the Library of Congress determines to be most suitable for its purposes.

Bit Preservation

A baseline preservation approach that ensures the integrity of digital objects and associated metadata over time in their original form, even as the physical storage media that house them evolve and change. Also known as "bit-level preservation."

Canonical

Use instead "Preservation Copy"

Chain of Custody

A process used to maintain and document the chronological history of the handling, including the transfer of ownership, of any arbitrary digital file from its creation to a final state version. See also "provenance."

Checksum

An algorithmically-computed numeric value for a file or a set of files used to validate the state and content of the file for the purpose of detecting accidental errors that may have been introduced during its transmission or storage. The integrity of the data can be checked at any later time by recomputing the checksum and comparing it with the stored one. If the checksums match, the data was almost certainly not altered. See also "Fixity Check."
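
The recompute-and-compare step described above can be sketched in a few lines of Python with the standard hashlib module. The helper name compute_checksum and the default choice of SHA-256 are assumptions for this example; an institution may use a different algorithm.

```python
import hashlib


def compute_checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hexadecimal checksum of `data` using the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Store the checksum when the content is first received.
original = b"example content"
stored = compute_checksum(original)

# At any later time, recompute and compare with the stored value.
unchanged = compute_checksum(b"example content") == stored
altered = compute_checksum(b"example c0ntent") == stored

print("unchanged copy matches:", unchanged)
print("altered copy matches:", altered)
```

A match does not prove who produced the file; it only indicates that the bytes are very likely the same as when the stored checksum was computed.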

Derivative

A transformed version of an original source file, often called a "service," "access," "delivery," "viewing" or "output" file, used to facilitate access to or additional use of the content.

Digital Content

Any arbitrary item created, published or distributed in a digital form, including, but not limited to, text, data, sound recordings, photographs and images, motion pictures and software. Used interchangeably with "Digital Materials."

Digital materials

Any arbitrary item created, published or distributed in a digital form, including, but not limited to, text, data, sound recordings, photographs and images, motion pictures and software. Used interchangeably with "Digital Content."

Digital object

A conceptual term that describes an aggregated unit of digital content comprised of one or more related digital files. These related files might include metadata, derivative versions and/or a wrapper to bind the pieces together.

Digital preservation

The series of managed activities, policies, strategies and actions to ensure the accurate rendering of digital content for as long as necessary, regardless of the challenges of media failure and technological change.

Digital signature

A method to authenticate digital materials that consists of an encrypted digest of the file being signed. The digest is an algorithmically-computed numeric value based on the contents of the file. It is then encrypted with the private part of a public/private key pair. To prove that the file was not tampered with, the recipient uses the public key to decrypt the signature back into the original digest, recomputes a new digest from the transmitted file and compares the two to see if they match. If they do, the file has not been altered in transit by an attacker. See also "Checksum" and "Fixity Check."
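
The sign-and-compare flow above can be walked through with a deliberately tiny, insecure key pair. Everything in this Python sketch is an assumption for illustration: the toy modulus built from the primes 61 and 53, the function names, and the reduction of the SHA-256 digest modulo that number. Real signatures use vetted cryptographic libraries, large keys and padding schemes.

```python
import hashlib

# Toy RSA key pair from tiny primes (61 and 53). NOT secure, illustration only.
MODULUS = 3233           # 61 * 53
PUBLIC_EXPONENT = 17     # public part of the key pair
PRIVATE_EXPONENT = 2753  # private part: 17 * 2753 leaves remainder 1 modulo 3120


def digest_as_int(data: bytes) -> int:
    """Compute the SHA-256 digest and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % MODULUS


def sign(data: bytes) -> int:
    """Transform the digest with the private exponent to produce the signature."""
    return pow(digest_as_int(data), PRIVATE_EXPONENT, MODULUS)


def verify(data: bytes, signature: int) -> bool:
    """Recover the digest with the public exponent and compare it with a fresh one."""
    recovered = pow(signature, PUBLIC_EXPONENT, MODULUS)
    return recovered == digest_as_int(data)


message = b"transfer manifest for accession 1"
signature = sign(message)
print("signature verifies for the signed file:", verify(message, signature))
```

Because the toy modulus is so small, two different files can share a reduced digest, which is one reason this sketch must not be used for real authentication.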

Emulation

A means of overcoming technological obsolescence of hardware and software by developing techniques for imitating obsolete systems on future generations of computers.

File format

Packages of information that can be stored as data files consisting of a fixed byte-serialized encoding of a specified information model, and/or a fixed encoding of that encoding in a tangible form on a physical storage structure.

Fixity check

A mechanism to verify that a digital object has not been altered in an undocumented manner. Checksums, message digests and digital signatures are examples of tools to run fixity checks. Fixity information, the information created by these fixity checks, provides evidence for the integrity and authenticity of the digital objects and is essential to enabling trust. See also "Checksum" and "Digital Signature."
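
A periodic fixity check over stored files might look like the following Python sketch, which compares recomputed SHA-256 values against previously recorded fixity information. The function names, the report labels ("ok", "altered", "missing") and the dictionary used to hold the recorded checksums are assumptions for this example, not part of any NDSA tool.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def fixity_report(root: Path, expected: dict[str, str]) -> dict[str, str]:
    """Compare each file under `root` with its recorded checksum."""
    report = {}
    for relative_name, recorded in expected.items():
        path = root / relative_name
        if not path.is_file():
            report[relative_name] = "missing"
        elif file_sha256(path) == recorded:
            report[relative_name] = "ok"
        else:
            report[relative_name] = "altered"
    return report


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_bytes(b"first file")
    (root / "b.txt").write_bytes(b"second file")
    recorded = {name: file_sha256(root / name) for name in ("a.txt", "b.txt")}

    # Simulate an undocumented change and a lost file.
    (root / "a.txt").write_bytes(b"first file, edited")
    (root / "b.txt").unlink()
    print(fixity_report(root, recorded))
```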

Format Migration

A means of overcoming technical obsolescence by preserving digital content in a succession of current formats or in the original format that is transformed into the current format for presentation. The purpose of format migration is to preserve the digital objects and to retain the ability for clients to retrieve, display, and otherwise use them in the face of constantly changing technology.

Ingest

The process through which digital objects are added into a managed environment.

Instance

Any particular instantiation of a digital file, object or collection.

Integrity

See "Fixity Check."

Life Cycle

A set of iterative, modular processes that govern the creation, acquisition, selection, description, sustainability, access and preservation of digital content over time.

Metadata: Administrative

Administrative metadata comprises both technical and preservation metadata and is generally used for internal management of digital resources.

Metadata: Descriptive

Metadata that identifies a resource and describes its intellectual content for purposes such as discovery and identification.

Metadata: Preservation

The contextual information necessary to carry out, document, and evaluate the processes that support the long-term retention and accessibility of digital content. Preservation metadata documents the technical processes associated with preservation, specifies rights management information, establishes the authenticity of digital content, and records the chain of custody and provenance for a digital object.

Metadata: Rights Management

Administrative metadata that indicates the copyrights, user restrictions, and license agreements that might constrain the end-use of digital content (including metadata files).

Metadata: Structural

Metadata used to describe the logical or physical types, versions, relationships or other characteristics of content files comprising a complex digital object.

Metadata: Technical

Metadata that describes the technical state of a file and the process used to create it. It is often closely related either to the file's format or to the original software and hardware used to create the file, e.g. scanning equipment and settings used to create or modify a digital object.

Migration

Use instead either "Format Migration" or "Storage Migration."

Organizational Unit

A department, division, directorate, program, sector or other group working to curate and preserve a digital collection.

Package (Noun)

Any arbitrary container of digital data. See "Package (Verb)."

Package (Verb)

The act of creating an arbitrary container of digital data. See "Package (Noun)."

Permissions

The access available to system users attached to specific roles in a computing environment, as well as the mechanism for administering access to a specific object on a computer system. Depending on the system or application, permissions can be defined for a specific user, specific groups of users, or all users; or for a role, or groups of roles; or based on one or more user attributes.

Preservation copy

Digital content targeted for preservation that is considered the master version of the intellectual content of any arbitrary digital resource. Preservation master files may capture additional information about the original beyond the content itself. Because they are created to high capture standards, preservation master files could take the place of the original record if the original was destroyed, damaged, or not retained. Preservation masters generally do not undergo significant processing or editing. Preservation masters are often used to make other copies including reproduction and distribution copies.

Process (noun)

A continuous and regular action or succession of actions occurring or performed in a definite manner, and having a particular result or outcome; a sustained operation or series of operations.

Process (verb)

To register or interpret (information, data, etc.); in computing, to operate on (data) by means of a program.

Provenance

Information on the origin of a digital object and also on any changes that may have occurred over the course of its life cycle.

Received Version

The primary authentic and unique item, either the original or the closest surviving surrogate or copy, as originally acquired by the Library. See also "Preservation Copy."

Restricted Use

A category of digital content restricted for any number of reasons including copyright restrictions, donor agreements, security clearance, presence of personally identifying information (PII), or simply that the content is intended for internal use only.

Schema

A formal description of a data structure. For XML, a common way of defining the structure, elements, and attributes that are available for use in an XML document that conforms to the schema.

Storage Migration

The process of copying content from one generation or configuration of digital data storage onto an updated generation or configuration.

Storage: Archival

The category of digital storage that provides the services and functions for the long-term storage, maintenance and retrieval of digital objects.

Storage: Nearline

A term used in computer science to describe an intermediate type of data storage that represents a compromise between online storage (supporting frequent, very rapid access to data) and offline storage/archiving (used for backups or long-term storage, with infrequent access to data). "Nearline" is a contraction of "near-online." See also "Offline Storage" and "Online Storage."

Storage: Offline

Any digital storage medium that must first be attached to a computing device before being made accessible to the computing system. Offline storage may be in the form of tape drives, fixed media (CDs, DVDs, flash drives) or hard drives that are not continuously network accessible. Also called removable storage. See also "Nearline Storage" and "Online Storage."

Storage: Online

Local or network-accessible storage utilized for data that is immediately accessible to an application without the need to stage it in from a lower tier of storage. See also "Nearline Storage" and "Offline Storage."

Unique Identifier

A string that uniquely identifies an object within an identification scheme.

Validation

The process of making sure that data is correct and useful when checked against a set of data validation rules. These might include rules for package or file structure or specific file format profiles.
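
As one possible reading of "rules for package or file structure," the Python sketch below checks a package directory against two simple rules: certain entries must exist, and files under data/ must have an allowed extension. The function name validate_package, the rule set and the data/ subdirectory are assumptions for illustration; real validation rules depend on the institution and the file format profiles in use.

```python
import tempfile
from pathlib import Path


def validate_package(root: Path, required: list[str], allowed_suffixes: set[str]) -> list[str]:
    """Return a list of rule violations; an empty list means the package passed."""
    problems = []

    # Rule 1: every required entry must be present.
    for relative_name in required:
        if not (root / relative_name).exists():
            problems.append(f"missing required entry: {relative_name}")

    # Rule 2: content files must use an allowed extension.
    data_dir = root / "data"
    if data_dir.is_dir():
        for path in sorted(data_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() not in allowed_suffixes:
                problems.append(
                    f"unexpected file type: {path.relative_to(root).as_posix()}"
                )
    return problems


with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp)
    (package / "data").mkdir()
    (package / "data" / "page1.tif").write_bytes(b"placeholder")
    (package / "data" / "notes.tmp").write_bytes(b"placeholder")
    for problem in validate_package(package, ["data", "manifest.txt"], {".tif"}):
        print(problem)
```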

Verify

The process of checking a copy of a data file to make sure that it is exactly equal to the original data file, or that a file remains unchanged over time.
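
A minimal Python sketch of verifying a copy against its original, using the standard filecmp and shutil modules. The helper names is_exact_copy and copy_and_verify are hypothetical; comparing stored checksums is an equally valid approach when the original is no longer at hand.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def is_exact_copy(original: Path, copy: Path) -> bool:
    """Compare file contents rather than relying only on size and timestamps."""
    return filecmp.cmp(original, copy, shallow=False)


def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy a file, then confirm the copy is identical to the source."""
    shutil.copy2(source, destination)
    return is_exact_copy(source, destination)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "master.dat"
    source.write_bytes(b"preservation master")
    print("copy verified:", copy_and_verify(source, Path(tmp) / "copy.dat"))
```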