Comparing routine backups, content archiving, and long-term digital preservation
Digital information, which relies on complex computing platforms and networks, is created, received, and used daily to deliver services to citizens, consumers, customers, businesses, and government agencies. In the twenty-first century, organizations face tremendous challenges in managing, preserving, and providing access to electronic records for as long as they are needed.
In the event of a catastrophe, a business needs a plan for remaining open in order to minimize loss of capital. Cerullo & Cerullo (2004) noted that “a business continuity plan (BCP) seeks to eliminate or reduce the impact of a disaster condition before the condition occurs” (p. 70). A business continuity plan has three functions: contingency, resilience, and recovery.
Although routine system backups, content archiving, and long-term digital preservation all preserve information, each serves a distinct function.
Most people are familiar with backups, which are typically run daily and often weekly as well, and which make a full copy of transactions and system activity. Backups are written serially, usually to tape, so searches must proceed linearly. That makes them slow and cumbersome for tasks such as retrieving records during the e-discovery phase of litigation, and complex searches using multiple search terms, phrases, or dates are difficult or impossible. However, if systems need to be restored after a system failure, data breach, or ransomware attack, the system data can be restored in full for business continuity (BC) purposes.
Content archiving is a relatively new concept in which all content can be archived in real time as it is created. This includes email messages, which are captured, time-stamped, and archived. Doing so preserves evidence and helps to avoid spoliation claims (claims that content was changed or deleted after the fact) during litigation.
Any email or content archiving solution must perform four key business functions:
- Ensure archive completeness;
- Provide efficient and reliable long-term storage;
- Ensure security and integrity of content;
- Provide immediate access to archived content for authorized users.
Long-term digital preservation (LTDP) is defined as the long-term, error-free storage of digital information, with means for retrieval and interpretation, for the entire time span the information is required to be retained. Digital preservation applies to content that is born digital as well as content that is converted to digital form.
Some digital information assets must be preserved permanently as part of an organization’s documentary heritage. Dedicated repositories for historical and cultural memory, such as libraries, archives, and museums, need to put in place trustworthy digital repositories that can match the security, environmental controls, and wealth of descriptive metadata that these institutions have created for analog assets (such as books and paper records). Digital challenges associated with records management affect all sectors of society (academic, government, private, and not-for-profit enterprises) and ultimately all citizens of all developed nations.
Fortunately, a few software and services vendors have developed a cloud-based approach to digital preservation, which makes the process of bringing LTDP expertise in-house much easier, faster, and more economical. This cloud-based approach also helps ensure the durability of the digital information: five or six copies of any digital document or file are saved on different servers in different parts of the world, using major cloud providers such as Amazon and Microsoft. That geographic dispersion helps to mitigate the risk associated with a disaster in any particular region. Files are stored in technology-neutral file formats, and the vendor takes care of any migrations to newer formats. The veracity and integrity of each file are tested regularly to ensure it has not been corrupted. A checksum algorithm is applied to detect any change at the bit level; if a change is found, the error is flagged and a new or updated copy is created.
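The integrity check described above can be illustrated with a minimal sketch. It is not any vendor's actual implementation. It assumes SHA-256 as the checksum algorithm, uses local temporary folders as stand-ins for geographically dispersed servers, and introduces the hypothetical names `sha256_of`, `audit_and_repair`, and `demo` purely for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_and_repair(copies: list[Path], expected: str) -> list[Path]:
    """Compare every copy with the recorded checksum and repair bad ones.

    Returns the copies that were flagged as missing or corrupted.
    """
    good = [p for p in copies if p.exists() and sha256_of(p) == expected]
    flagged = [p for p in copies if p not in good]
    if not good:
        raise RuntimeError("no intact copy left to repair from")
    for bad in flagged:
        # Overwrite the damaged copy with bytes from a verified copy.
        shutil.copyfile(good[0], bad)
    return flagged


def demo() -> tuple[list[str], bool]:
    """Create five copies, corrupt one, then audit and repair."""
    with tempfile.TemporaryDirectory() as tmp:
        copies = []
        for i in range(5):
            site = Path(tmp) / f"site_{i}"  # assumed stand-in for a remote region
            site.mkdir()
            copy = site / "record.pdf"
            copy.write_bytes(b"preserved content")
            copies.append(copy)

        # Checksum recorded at ingest time, before any damage occurs.
        expected = sha256_of(copies[0])

        # Simulate bit-level corruption in one copy.
        copies[2].write_bytes(b"preserved c0ntent")

        flagged = audit_and_repair(copies, expected)
        all_ok = all(sha256_of(p) == expected for p in copies)
        return [p.parent.name for p in flagged], all_ok


if __name__ == "__main__":
    print(demo())
```

In this sketch the checksum is recorded once at ingest and every later audit compares each copy against it, so a single altered byte is enough to flag a copy. A production service would presumably also log each audit and verify the repaired copy, details the description above does not specify.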