


Storage Classification

Overview of storage for recording, transport, online, nearline and offline use.

Different kinds of storage systems are used throughout media production workflows. Typically, a storage system is built on top of some physical storage medium (flash or solid-state memory, magnetic disks, optical discs) and features a file system that unifies access to raw storage capacity and stored data. File systems are specifically designed to operate on a given media technology and may not work, or may perform poorly, on other media.

Storage Performance Metrics
The major metrics that describe the performance of storage systems are capacity, latency, reliability, durability, throughput and sustained bandwidth. Capacity describes how much raw data can be stored on a medium. Latency is the time it takes to make data available for reading (much longer on magnetic tape than on spinning disks). Reliability defines the probability of errors when reading data in normal operation, while durability describes the longevity of data on the medium. Throughput describes the number of read and write operations per second (IOPS) when the data size is small (e.g. 4 to 64 kB blocks). This is important for databases and document stores, which access storage at such block sizes. Sustained bandwidth, on the other hand, describes how fast data can be read from or written to storage over long durations. This is an important metric for applications that generate large files. In some cases, physical dimensions, weight and energy consumption are also important, for example when handling mobile storage media in the field.
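To make the difference between throughput and sustained bandwidth concrete, here is a minimal arithmetic sketch. The function names and all figures in the usage note (file size, bandwidth, IOPS) are illustrative assumptions, not measurements of any particular device.

```python
def transfer_seconds(size_bytes, bandwidth_bytes_per_s):
    """Time to move a large file at a given sustained bandwidth."""
    return size_bytes / bandwidth_bytes_per_s


def small_block_rate(iops, block_bytes):
    """Effective data rate (bytes/s) when access is limited by IOPS
    at a small block size, as is typical for database workloads."""
    return iops * block_bytes
```

Under these assumptions, copying a hypothetical 1 TB file at a sustained 500 MB/s takes `transfer_seconds(1_000_000_000_000, 500_000_000)` = 2000 seconds, while a device limited to 10,000 IOPS at 4096-byte blocks moves only about 41 MB/s. This is why the two metrics are reported separately.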

Storage Tiers
Capture storage: digital film and TV cameras generate a very high and often constant raw data rate during capture. Only special flash memory (CF cards) and solid-state drives (SSDs) are fast enough to sustain such data rates across the entire capacity of the medium while still being small and light enough to carry. Storage found in camera magazines uses simple file systems (e.g. FAT32 or exFAT) to keep in-camera software complexity low, even though FAT32 limits the size of a single file to 4 GB. Capture media is not durable over long time frames, so data should be copied for processing and archiving.
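The practical effect of the FAT32 file size limit can be estimated with a short calculation. The sketch below assumes the commonly documented FAT32 maximum of 2^32 - 1 bytes per file; the function names and the data rate in the usage note are hypothetical.

```python
# Largest single file FAT32 can hold: one byte short of 4 GiB.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def seconds_until_fat32_limit(data_rate_bytes_per_s):
    """Recording time before a single file reaches the FAT32 limit."""
    return FAT32_MAX_FILE_BYTES / data_rate_bytes_per_s


def segments_needed(clip_bytes):
    """Number of files a clip must be split into to fit on FAT32."""
    # Ceiling division without floating point.
    return -(-clip_bytes // FAT32_MAX_FILE_BYTES)
```

At an assumed constant rate of 100 MB/s, a camera would reach the limit after roughly 43 seconds, so a continuous take on FAT32 media has to be spread over several files.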

Shuttle storage: used for transporting large amounts of data between facilities or locations. Shuttle storage is normally a RAID array of multiple SSDs or HDDs installed in a lightweight enclosure. For high-speed data transfers, at least a USB 3 or Thunderbolt connector is necessary. RAID-1 or RAID-5 should be used to prevent data loss from a drive failure. The choice of file system is usually guided by maximum interoperability between facilities. However, all data on shuttle storage should be encrypted to prevent data theft, and this is best done at the file system level. Unfortunately, the most interoperable file system is often also the simplest one, which supports neither encryption nor data integrity protection.
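The redundancy of RAID-1 and RAID-5 costs usable capacity. As a rough sketch using the standard capacity formulas (a mirror stores one copy per drive; RAID-5 spends the equivalent of one drive on parity), the hypothetical helper below estimates what remains for data. Real arrays lose a little more to metadata and formatting.

```python
def usable_capacity(level, drive_sizes):
    """Approximate usable capacity of a RAID-1 or RAID-5 array.

    drive_sizes is a list of per-drive capacities in any one unit.
    The smallest drive determines how much of each member is used.
    """
    n = len(drive_sizes)
    if level == 1:
        if n < 2:
            raise ValueError("RAID-1 needs at least 2 drives")
        return min(drive_sizes)            # every drive holds a full copy
    if level == 5:
        if n < 3:
            raise ValueError("RAID-5 needs at least 3 drives")
        return (n - 1) * min(drive_sizes)  # one drive's worth of parity
    raise ValueError("only RAID levels 1 and 5 are handled in this sketch")
```

For example, four 4 TB drives give about 12 TB in RAID-5, while two 4 TB drives in RAID-1 give 4 TB.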

Online storage: data storage for frequent and rapid access that is immediately available for use. This usually covers all locally attached storage and all networked storage on a NAS or SAN that a computer can access. In film or TV production, all relevant raw and proxy files of an ongoing workflow are stored here. Online storage uses spinning hard disks for quick random access and may use solid-state drives for high-speed data streaming or for caching frequently used files.

Nearline storage: a storage tier that is not immediately available, but can be made available quickly without human intervention. This can be a tape or optical library system where a robotic arm loads cartridges into drives, or a networked storage system that keeps data on idle hard drives that are only spun up when data is requested. Nearline storage is usually much cheaper than online storage because operational costs are low. With decreasing prices, nearline storage systems are now also used in traditional offline storage areas such as backups and content archives.

Offline storage: a storage tier where data is not immediately available. Some human intervention is required to make the data available. Offline storage is used for long-term storage of backups and archives, but also for data transport (see also shuttle storage).
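The distinction between online, nearline and offline storage comes down to two questions: is the data immediately available, and does making it available require a person? A minimal sketch of that decision, with hypothetical names chosen for illustration only:

```python
from enum import Enum


class Tier(Enum):
    ONLINE = "online"
    NEARLINE = "nearline"
    OFFLINE = "offline"


def classify(immediately_available, needs_human):
    """Map the two defining properties to a storage tier."""
    if needs_human:
        return Tier.OFFLINE       # e.g. a tape on a shelf
    if immediately_available:
        return Tier.ONLINE        # e.g. locally attached disk, NAS or SAN
    return Tier.NEARLINE          # e.g. robotic tape library, spun-down disks
```

For instance, `classify(False, False)` yields `Tier.NEARLINE`, matching a tape library that loads cartridges automatically.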

Cloud storage: a storage tier where data is stored off-site on systems owned and operated by a third-party cloud operator. Clients rent storage capacity like a utility and can scale their usage up and down. Usually a client can select between different storage tiers with online and nearline characteristics. The cloud operator is responsible for keeping all data available and securely stored, and allows clients to encrypt data at rest and to define custom access permissions. Cloud storage is an economical option for content archives, backups and disaster recovery data, but also for exchanging data between facilities in a production workflow.

Author: Alexander Eichhorn
