DATA



Local Mass Storage

A short overview of local mass storage and its interfaces.

Direct-attached storage encompasses all digital storage media directly connected to a computer's internal and external interfaces. This type of storage is exclusive to a single host and cannot be attached to a second host at the same time or shared without additional software. For improved flexibility and robustness, individual drives may be encapsulated by a storage management layer, sometimes called a volume or array manager, that presents a virtual disk volume to the layers above.
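
To make this layering concrete, here is a minimal Python sketch of such a storage management layer, assuming hypothetical raw block devices; the class and function names are illustrative, not any real volume manager's API:

  BLOCK_SIZE = 4096

  class RawDisk:
      """A raw disk modeled as a fixed array of blocks."""
      def __init__(self, num_blocks):
          self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

  class ConcatVolume:
      """Presents several raw disks as one contiguous virtual volume."""
      def __init__(self, disks):
          self.disks = disks

      def _locate(self, lba):
          # Translate a volume-wide block address into (disk, local address).
          for disk in self.disks:
              if lba < len(disk.blocks):
                  return disk, lba
              lba -= len(disk.blocks)
          raise IndexError("block address beyond end of volume")

      def read(self, lba):
          disk, local = self._locate(lba)
          return disk.blocks[local]

      def write(self, lba, data):
          disk, local = self._locate(lba)
          disk.blocks[local] = data

  # A file system above sees one disk of 1024 + 2048 blocks:
  volume = ConcatVolume([RawDisk(1024), RawDisk(2048)])
  volume.write(1500, b"\x01" * BLOCK_SIZE)  # transparently lands on disk 2

The file system only ever sees the virtual volume, which is what lets the layer below grow or rearrange the underlying disks independently.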

There are multiple I/O technologies for connecting raw storage media, such as SATA, SAS, and FibreChannel. SAS and FibreChannel drives and host adapters provide higher performance, but are also more expensive. They are used in servers and large disk arrays where many I/O operations per second (IOPS) are required, for example by large business databases. For high storage capacity and streaming media access, consumer-grade SATA drives are sufficient.

SATA

Serial AT Attachment (SATA) is a consumer-grade I/O standard for mass storage devices used in almost all laptop and desktop computers today. SATA allows only a simple point-to-point topology, hence each storage device needs a separate physical connection to the host adapter. Port multipliers can attach up to 15 devices to a single port, but they share that port's bandwidth, so performance degrades accordingly. Cables are limited to 1 m in length. External SATA (eSATA) has more robust connectors and shielded cables allowing for lengths up to 2 m. Since SATA 3.0, Native Command Queuing (NCQ) includes a streaming command for isochronous transfers, allowing digital content to be streamed at a predictable quality of service.

SAS

Serial Attached SCSI (SAS) is an enterprise-grade point-to-point I/O standard for storage devices. SAS offers many professional features that SATA lacks, such as better error recovery, drive identification, a more flexible connection topology, failover support, longer cables, server backplane support and full-duplex links. SAS-enabled devices and storage arrays can be connected via expanders and switches, which allows a topology to span multiple racks in a data center.

FibreChannel

FibreChannel (FC) is a high-speed network standard for storage arrays. Historically used in supercomputers, FC has become the technology of choice in storage area networks (SANs). Similar to the Internet protocols, FC is a multi-layer network stack that supports flexible topologies (point-to-point, loops, switched fabrics) and different physical media (optical fibre and copper cables). Link speeds range from 1 to 32 Gbit/s and a FibreChannel network can span from a few meters up to 10 km.

Storage virtualization

For greater flexibility, it is possible to create virtual or logical storage volumes from multiple raw disks. Such volumes appear to file systems as regular disks, but an operator can grow, migrate and quickly snapshot an entire volume without involving the file systems or any applications above. Virtual storage management is available as an add-on service for all modern server operating systems and is already integrated into newer file systems (ZFS, Btrfs) and object stores (Ceph and OpenStack Swift).
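
As an illustration of why snapshots are cheap at this layer, here is a minimal copy-on-write sketch in Python. The block map and class names are hypothetical, not taken from any real volume manager: a snapshot freezes the current block map without copying contents, and later writes store only the new blocks.

  class CowVolume:
      """A logical volume whose snapshots share unmodified blocks."""
      def __init__(self, base=None):
          self.base = base    # read-only parent holding older block versions
          self.overlay = {}   # blocks written since the last snapshot

      def snapshot(self):
          # Freeze the current state; no block contents are copied.
          frozen = CowVolume(self.base)
          frozen.overlay = self.overlay
          self.base, self.overlay = frozen, {}
          return frozen

      def write(self, lba, data):
          self.overlay[lba] = data        # copy-on-write: store only new data

      def read(self, lba):
          if lba in self.overlay:
              return self.overlay[lba]
          if self.base is not None:
              return self.base.read(lba)
          return bytes(4096)              # unwritten blocks read as zeros

  vol = CowVolume()
  vol.write(0, b"v1")
  snap = vol.snapshot()                   # instant, regardless of volume size
  vol.write(0, b"v2")
  assert snap.read(0) == b"v1" and vol.read(0) == b"v2"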

RAID

Another storage virtualization technique is a redundant array of independent disks (RAID). RAID combines multiple disk drives for the purpose of increased availability. Several configurations, called RAID levels, exist that differ in how many drives may fail before data is lost. Most RAID levels also offer improved read performance since data is automatically spread across disks. RAID-5 allows one drive to fail, RAID-6 allows two, and RAID-7 allows up to three drives to fail before data is at risk (see the parity sketch after the list below).

  • JBOD: spanning (not a RAID level), concatenates differently sized drives, none can fail
  • RAID-0: striping, joins capacity, distributes load, minimum 2 drives, none can fail
  • RAID-1: mirroring, copies data, optimizes read, minimum 2 drives, n-1 drives can fail
  • RAID-5: striping with parity, minimum 3 drives, 1 drive can fail
  • RAID-6: striping with double parity, minimum 4 drives, 2 drives can fail
  • RAID-7: striping with triple parity, minimum 5 drives, 3 drives can fail
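
To make the parity mechanism concrete, here is a minimal Python sketch of a single RAID-5 stripe. Real arrays rotate the parity block across drives and operate in the controller or kernel; this only shows the XOR arithmetic:

  def xor_blocks(blocks):
      """XOR equally sized byte blocks together."""
      result = bytearray(len(blocks[0]))
      for block in blocks:
          for i, byte in enumerate(block):
              result[i] ^= byte
      return bytes(result)

  # One stripe on a 4-drive array: three data blocks plus one parity block.
  data = [b"AAAA", b"BBBB", b"CCCC"]
  parity = xor_blocks(data)

  # If any single drive fails, its block is the XOR of all surviving blocks.
  survivors = [data[0], data[2], parity]  # drive holding data[1] failed
  assert xor_blocks(survivors) == data[1]

RAID-6 and RAID-7 extend the same idea with additional, independent parity functions (typically Reed-Solomon style codes) so that two or three blocks per stripe can be reconstructed.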

RAID may be implemented in hardware controllers or in software. Dedicated hardware performs better because I/O is offloaded from the CPU to the controller. Some RAID controllers contain extra memory for buffering data and a battery to prevent data loss on power failures. However, hardware controllers are opaque about their on-disk data layout, limiting repair options when an array fails. Software RAID is more transparent and flexible, allowing for better troubleshooting. Disks for hardware RAID must be directly attached to the controller, whereas software RAID can utilize any internal or external disks across controllers and server backplanes.

Although RAID-5 is widely used, it is criticized for inherent weaknesses. Rebuilding a RAID-5 array requires reading all data from all surviving disks. With growing disk drive sizes, the rebuild time of a degraded array can extend to several days or weeks. During that time the array delivers reduced performance and is vulnerable to a second drive failure. Such a failure becomes increasingly likely with larger drives (> 1 TB) because the unrecoverable bit error rate has not improved with drive capacity, so a full rebuild read is ever more likely to encounter an error. Therefore, vendors advise against the continued use of RAID-5 for business-critical data and instead suggest RAID-6 or RAID-7.
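
A back-of-the-envelope calculation shows the scale of the problem. The drive size, rebuild speed and error rate below are illustrative assumptions, not vendor figures:

  import math

  drive_bytes  = 8e12   # assumed: 8 TB drives
  rebuild_rate = 100e6  # assumed: 100 MB/s sustained rebuild read speed
  ure_per_bit  = 1e-14  # assumed: consumer-class unrecoverable error rate
  data_drives  = 5      # surviving drives that must be read in full

  rebuild_hours = drive_bytes / rebuild_rate / 3600
  bits_read     = data_drives * drive_bytes * 8
  p_hit_error   = 1 - math.exp(-ure_per_bit * bits_read)

  print(f"ideal rebuild pass: {rebuild_hours:.0f} h")       # ~22 h
  print(f"P(unrecoverable read error): {p_hit_error:.0%}")  # ~96%

Even this idealized pass takes nearly a day, and under production I/O load rebuild rates drop far below the assumed 100 MB/s, which is how rebuilds stretch to days or weeks.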

To prevent sudden failure, it is beneficial to perform periodic data scrubbing. Scrubbing is the process of reading and checking all blocks of a RAID array for consistency. When bad blocks are detected, they are internally replaced with spare blocks, while the array remains active and usable during the process.
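
In terms of the stripe sketch above, a scrub pass boils down to re-reading every stripe and recomputing its parity. A minimal, purely illustrative version, reusing xor_blocks and the data/parity stripe from the RAID-5 example:

  def scrub(stripes):
      """stripes: list of (data_blocks, parity_block) tuples."""
      bad = []
      for number, (data_blocks, parity_block) in enumerate(stripes):
          if xor_blocks(data_blocks) != parity_block:  # parity mismatch
              bad.append(number)  # a real array would rewrite or remap here
      return bad

  stripes = [(data, parity), (data, b"\x00\x00\x00\x00")]  # second is corrupt
  assert scrub(stripes) == [1]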

Mass Storage Interfaces

  Interface            Year   Max. Throughput   Cable Length
  SATA 3.0             2009   6 Gbit/s          1 m (2 m eSATA)
  SATA 3.2             2013   16 Gbit/s         1 m (2 m eSATA)
  SAS 1.0              2005   3 Gbit/s          10 m
  SAS 2.0              2009   6 Gbit/s          10 m
  SAS 3.0              2013   12 Gbit/s         10 m
  FibreChannel 4GFC    2003   4 Gbit/s          up to 10 km
  FibreChannel 8GFC    2008   8 Gbit/s          up to 10 km
  FibreChannel 16GFC   2009   16 Gbit/s         up to 10 km
  FibreChannel 32GFC   2012   32 Gbit/s         up to 10 km