Backups and Disaster Recovery

Storage and computer technology is not flawless.

During operation you are faced with several threats that can put parts or all of your data at risk. Threats range from device failures, viruses, theft, human error and malicious damage to software bugs, data corruption during transfer or when data is at rest, equipment damage by voltage surge, fire, water or other environmental hazards.

A good data backup strategy is at the core of a business continuity plan that protects you and your company from disaster. This article outlines tools and concepts you can use to put together an efficient and reliable solution.

Start with a **primary copy** of your data, that is, a single collection of files for every data set. You may organize data sets by project, by business purpose, by source computer or even by date. Every version of every file you create in your workflows needs to become part of this primary copy. Without the notion of a primary copy, you cannot have an organized backup strategy.

The 3-2-1 Rule

Follow the 3-2-1 rule: always keep three copies of your data on two different storage media and make sure one copy is off-site.

Any important file you cannot easily recreate from other data should be stored in three copies, one primary and two backups. That way you make sure that when your primary storage fails and even if one backup copy is corrupted, you are still able to restore your data.

Keep the two backups on two different media types such as hard drives, optical media or magnetic tape. Having one read-only or offline medium in the mix prevents viruses and other malware from attacking this backup copy. Using different technologies also ensures you do not experience the same technology-related problem in both backups.

Physically separate one backup copy from your on-site operation; the greater the distance, the better. This protects you from theft, natural disaster and environmental hazards such as damage by fire, water and electrical surges in one location.
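As a minimal sketch, the rule can be written down as a check over an inventory of copies. The `DataCopy` record, its fields and the media labels are hypothetical names chosen for illustration. The check counts distinct media across all copies, which is one reading of "two different storage media".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of a data set (hypothetical record for illustration)."""
    name: str
    medium: str    # free-form label, e.g. "hdd", "tape", "optical", "cloud"
    offsite: bool  # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule.

    Assumption: "two different media" is counted across all copies,
    including the primary.
    """
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


inventory = [
    DataCopy("primary", "hdd", offsite=False),
    DataCopy("local backup", "hdd", offsite=False),
    DataCopy("vault backup", "tape", offsite=True),
]
print(satisfies_3_2_1(inventory))
```

A check like this only helps if the inventory is kept up to date; it says nothing about whether the copies themselves are readable.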

Backups in a workflow

Different stages in a production workflow require specific ways of handling backups.

Capture: During capture it's often not possible to generate automatic backups. When capture media fail or get corrupted before data is transferred, there is a risk of losing some or all of the data. Most cameras do not even store checksums on a frame or file level, so it's impossible to tell whether data read from a capture drive is what was actually captured. Some professional cameras accept a second capture drive or output data over a live connection, which should be used if a retake is not possible. Otherwise it's important to regularly monitor the health of capture drives and offload data as soon as possible.

Ingest: When data is offloaded from a capture drive for the first time, an automatic backup of checksummed files should be performed. Capture drives should not be cleared before a quality control step has confirmed the integrity of the copy. It's best to use software like Pomfort Silverstack, Shotput Pro or Carbon Copy Cloner, which creates automatic checksums for each file. You need these checksums later if you want to verify that files have not been silently corrupted at rest or in transit.
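To illustrate what such checksums are used for, here is a minimal sketch that records one checksum per file at ingest and later compares a copy against that record. The function names are hypothetical, and SHA-256 is simply a choice made for this example; the tools named above may use other algorithms and manifest formats.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(source_manifest, copy_root):
    """Return files that are missing or differ in the copy."""
    copy_manifest = build_manifest(copy_root)
    return sorted(
        name
        for name, checksum in source_manifest.items()
        if copy_manifest.get(name) != checksum
    )
```

In use, you would build the manifest from the capture drive, store it alongside the copy, and clear the drive only once `find_mismatches` returns an empty list. Re-running the comparison later detects corruption that occurred at rest or in transit.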

Work: Backing up work in progress is difficult because a workflow may spread over multiple independent workstations and changes to files may happen frequently, generating many versions per day. Application software should be configured to auto-save progress often. Unless work-in-progress files reside on central network storage, it's best to create incremental backups for each workstation, for example using software like Time Machine on macOS. For data stored on NAS/SAN it's possible to generate central backups and snapshots of storage volumes. Snapshots are particularly useful because they freeze a state in time across all files on a shared storage volume. That way, files you change or delete while the backup runs will still be copied. Snapshots can serve as a starting point for an off-site backup or as quick restore/recover points when data was accidentally deleted or corrupted. Modern filesystems like btrfs, as well as SANs and some NAS systems, support snapshots out of the box.
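The core idea behind an incremental backup, copying only what is new or has changed since the last run, can be sketched as follows. This is a simplified illustration with a hypothetical function name, not how Time Machine or any other product works: it detects changes by file size and modification time only, overwrites the previous backup of a changed file rather than keeping older versions, and does not remove files that were deleted at the source.

```python
import shutil
from pathlib import Path


def incremental_copy(source_root, backup_root):
    """Copy files that are new or changed; return their relative paths."""
    source_root, backup_root = Path(source_root), Path(backup_root)
    copied = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Unchanged: same size and the source is not newer than the backup.
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel.as_posix())
    return copied
```

Because files may change while such a copy runs, the result is not a consistent point-in-time state, which is exactly the gap that volume snapshots close.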

Archive: All business and media data that is supposed to be kept long-term should get a full 3-2-1 backup. Make sure you also back up copies of the operating system versions and application software you've used in your projects. Future software updates may remove the ability to read old versions of file formats. Don't forget to back up software licenses as well. Good places for archival backups are cloud storage and offline media such as optical discs and tapes. Make sure archives are encrypted if confidentiality is important and keep the decryption keys in a safe and accessible place.

Disaster Recovery

The purpose of a disaster backup is to make sure your business operations are interrupted for a short time only. Restore procedures must be easy and clear, the backup must contain all relevant data and applications needed to continue operations on new hardware, and the backup must be quickly accessible when needed. Disaster recovery backups also require some form of physical separation regardless of the backup media. A good place is cloud storage because of price and availability. When creating a disaster backup it's important to make sure that only known good versions of files are included.

Data that should be part of a disaster recovery backup:

  • operating system installation files and updates
  • application program installation files
  • device drivers
  • software licenses
  • application, system and network settings
  • passwords, certificates and other login credentials

Have realistic expectations about time and costs of accessing an off-site backup and restoring data from tape, optical media or the cloud. Document your backup strategy and restore procedure. Use checklists that are easy to understand by all staff members, print them on paper and keep copies off-site. Regularly check your restore procedure and learn how long it takes. A successful backup does not mean restoring data is possible at all or that all the data you actually need is in the backup.
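A restore drill can be partly automated. The sketch below assumes you can wrap your actual restore procedure in a Python callable; it measures how long the restore takes and reports which items from a checklist are missing afterwards. The function name, the `restore` callable and the checklist entries are hypothetical placeholders.

```python
import time
from pathlib import Path


def timed_restore_test(restore, target_dir, required_paths):
    """Run restore(target_dir); return (elapsed_seconds, missing_items).

    `restore` stands in for whatever procedure brings data back from
    tape, optical media or the cloud (an assumption of this sketch).
    """
    start = time.monotonic()
    restore(target_dir)
    elapsed = time.monotonic() - start
    target = Path(target_dir)
    missing = [p for p in required_paths if not (target / p).exists()]
    return elapsed, missing
```

Recording the elapsed time for each drill gives you a realistic figure for how long operations would be interrupted, and a non-empty list of missing items shows that the backup does not yet contain everything you need.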
