Data Rate Reduction

Data reduction methods are used to reduce the data rate of a stream or file.

In practical implementations, the data rate of the delivered signal components is often already reduced while the signal is being generated. Video data reduction can be achieved by reducing the following parameters:

  • Resolution
  • Chrominance sampling structure (for example from 4:2:2 to 4:2:0)
  • Quantisation depth (e.g. from 10 bits to 8 bits)
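
The effect of these parameters on the uncompressed data rate can be estimated with simple arithmetic. The following is a minimal sketch: the function name, the 1920x1080 picture size and the 25 frames per second are assumptions chosen for illustration, and only the active picture is counted (no blanking or audio).

```python
# Average samples per pixel for common chroma sampling structures
# (one luma sample plus the two subsampled chroma components).
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def active_video_rate_mbps(width, height, fps, sampling, bit_depth):
    """Return the uncompressed active-picture data rate in Mbit/s."""
    samples_per_frame = width * height * SAMPLES_PER_PIXEL[sampling]
    bits_per_second = samples_per_frame * bit_depth * fps
    return bits_per_second / 1_000_000


# Illustrative values only: HD picture at 25 frames per second.
full = active_video_rate_mbps(1920, 1080, 25, "4:2:2", 10)
reduced = active_video_rate_mbps(1920, 1080, 25, "4:2:0", 8)
print(f"4:2:2 10-bit: {full:.1f} Mbit/s")
print(f"4:2:0  8-bit: {reduced:.1f} Mbit/s")
```

Under these assumptions, moving from 4:2:2 at 10 bits to 4:2:0 at 8 bits leaves 60% of the original data rate.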

Applying these options affects the original signal quality (and therefore the image quality) and, depending on the quality of the source, can considerably reduce the ability to run subsequent processing steps.

Data reduction can be applied both in the video domain and in the signal domain. The options mentioned above all remove data points from the signal. The lost data points cannot be reinstated perfectly, and these processes are therefore referred to as “lossy”. Signal domain compression can also be lossless, as in “run length coding”, where repeated sequences of bits are represented by a short label.
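
Run length coding can be illustrated in a few lines. This is a minimal sketch, not the scheme of any particular codec: the function names and the bit string are made up for the example, and each run is stored as a (symbol, length) pair.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse each run of identical symbols into a (symbol, length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def rle_decode(pairs):
    """Expand (symbol, length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in pairs)


data = "0000000011100000"
encoded = rle_encode(data)
print(encoded)
assert rle_decode(encoded) == data  # lossless: the round trip is exact
```

The sixteen input symbols are described by three pairs, and decoding returns exactly the original sequence.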

Video domain data reduction is based on analysis of the picture content. Image properties such as “relevance” are taken into account – some portions of the picture can be simplified, others completely discarded without the viewer noticing or caring much.

The following approaches to data reduction are part of many compression systems:

  • DCT (Discrete Cosine Transform) for a digital video signal, including SDI, HD-SDI
  • Quantisation of the DCT coefficients
  • Variable Length Coding (VLC)

The DCT Process

The Discrete Cosine Transform (DCT) is a mathematical operation which underpins many digital video compression systems such as JPEG, MPEG-2 and MPEG-4. The DCT does not in itself reduce the data rate, but it restructures the image information so that the data can be reduced in later steps.

The discrete pixel values are converted into a spectral (frequency) representation. Since a frequency representation makes no sense for a single pixel, blocks are defined. In most cases, blocks of 8x8 pixels are transformed into the frequency domain by means of the DCT (in MPEG systems, several such blocks are grouped into a macroblock). This yields 64 coefficients describing the pixel block, which can be converted back into pixel values by the inverse transform.

The DCT itself can be regarded as lossless, although in practice some rounding errors are introduced because of finite arithmetic precision. The potential data reduction depends on the subsequent processes that the selected compression tools apply to the DCT coefficients.
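
The transform and its inverse can be sketched directly from the textbook definition. The code below is a slow, illustrative implementation of an orthonormal 8x8 DCT-II in plain Python; real codecs use fast integer approximations, and the test block (a smooth horizontal ramp) is an assumption chosen to show how a low-detail area behaves.

```python
import math

N = 8  # block size used in the text


def _scale(k):
    # Orthonormal scaling factor for the DCT-II basis functions.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _cos(x, u):
    # Basis function of frequency index u sampled at position x.
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT-II; result is indexed [vertical][horizontal] frequency."""
    return [[_scale(u) * _scale(v) * sum(block[y][x] * _cos(x, u) * _cos(y, v)
                                          for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]


def idct_2d(coeffs):
    """Inverse transform: rebuild the N x N samples from the coefficients."""
    return [[sum(_scale(u) * _scale(v) * coeffs[v][u] * _cos(x, u) * _cos(y, v)
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]


# A smooth horizontal ramp, typical of low-detail picture areas.
block = [[16 * x for x in range(N)] for _ in range(N)]
coeffs = dct_2d(block)
restored = idct_2d(coeffs)
max_error = max(abs(restored[y][x] - block[y][x])
                for y in range(N) for x in range(N))
print(f"DC coefficient: {coeffs[0][0]:.1f}")
print(f"max round-trip error: {max_error:.2e}")
```

For this block all of the energy ends up in the first row of coefficients, and the round trip differs from the input only by floating-point rounding. The 64 input values are still 64 values after the transform, so nothing has been saved yet.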

Quantisation of the DCT coefficients

The resulting DCT coefficients are then weighted in the quantisation stage. The weighting takes into account how sensitive our visual sense is to each individual coefficient. The weighted coefficients are then rounded to whole numbers. The combination of weighting and rounding is referred to as "quantisation" of the DCT coefficients. The finer the quantisation, the more precisely the signal can be restored.

Quantisation is a crucial step within a compression process, since the losses it introduces are irretrievable!
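
Weighting and rounding can be sketched as follows. The coefficient values and the weighting matrix are hypothetical (real codecs define their own tables), and only a 4x4 corner of a block is shown to keep the example short.

```python
def quantise(coeffs, weights, scale=1):
    """Divide each coefficient by its weight and round to a whole number."""
    return [[round(c / (w * scale)) for c, w in zip(c_row, w_row)]
            for c_row, w_row in zip(coeffs, weights)]


def dequantise(levels, weights, scale=1):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[q * w * scale for q, w in zip(q_row, w_row)]
            for q_row, w_row in zip(levels, weights)]


# Hypothetical 4x4 corner of a coefficient block (low frequencies top-left).
coeffs = [[448.0, -103.2, 0.4, -10.8],
          [12.6, 5.1, -2.2, 0.9],
          [-4.3, 1.7, 0.6, -0.3],
          [2.1, -0.8, 0.2, 0.1]]

# Illustrative weights: coarser steps for higher frequencies.
weights = [[8 + 4 * (u + v) for u in range(4)] for v in range(4)]

levels = quantise(coeffs, weights)
restored = dequantise(levels, weights)
print(levels)
print(restored[0])
```

Most of the high-frequency coefficients become zero, and the restored values (for example -108 instead of -103.2) show the loss that cannot be undone. Raising the hypothetical `scale` parameter makes the steps coarser and the loss larger.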

The VLC method

Once the number of distinct coefficient values has been reduced through quantisation, the most dramatic reduction in data rate is achieved by Variable Length Coding (VLC). The VLC process is completely lossless. The method examines the supplied data stream to determine which values occur most often. These more frequent values are given a shorter codeword; the rarer values a longer one. Morse code was designed on the same principle: in English, “E” is the most commonly used vowel and “T” the most commonly used consonant. Those two letters were therefore given the shortest possible codewords, a single dot and a single dash respectively.

Variable-length coding is therefore an efficient building block in codecs when it is combined with quantisation.
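
One well-known way to build such a code table is Huffman coding. The sketch below uses it as an illustration only; video standards typically ship predefined tables rather than building one per stream, and the list of quantised values is invented for the example.

```python
import heapq
from collections import Counter


def build_vlc_table(symbols):
    """Build a Huffman code: frequent symbols get shorter codewords."""
    counts = Counter(symbols)
    # Heap entries: (count, tie_breaker, {symbol: codeword_so_far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single distinct symbol
        return {s: "0" for s in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        n0, _, low = heapq.heappop(heap)
        n1, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


def decode(bits, table):
    """Read a prefix-free bit string back into symbols."""
    reverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out


# Quantised coefficients are dominated by zeros and small values.
levels = [0] * 40 + [1] * 12 + [-1] * 8 + [2] * 3 + [56]
table = build_vlc_table(levels)
bits = "".join(table[v] for v in levels)
print(table)
print(f"{len(bits)} bits instead of {len(levels) * 8} at a fixed 8 bits each")
assert decode(bits, table) == levels  # lossless round trip
```

The value 0, which occurs most often, receives a one-bit codeword, while the single occurrence of 56 receives the longest one. The saving comes entirely from the skewed statistics that quantisation produces.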

One of the most commonly used approaches to reducing the data rate, in addition to removing redundancy within an image (intraframe), is to remove the redundancy between successive images (interframe). This is particularly relevant for variants of compression formats that are used for digital broadcasting (lower data rate = lower costs). The best-known examples are the so-called long-GOP (Group of Pictures) variants of MPEG-2 and MPEG-4.

Long-GOP procedures

With the aid of a frame store (image memory) and picture analysis, the compression system attempts to predict the next expected image(s) from the information that it already has from the previous frames. Only the difference between the prediction and the actual image is then passed to the DCT. The more accurate the prediction, the smaller the difference. This allows the picture to be compressed more heavily, as less information is needed to describe the change between frames.

The Long-GOP prediction process also includes motion analysis within the picture. The encoder uses "motion vectors" to track change and movement within the frame. This allows for a more accurate prediction of an image during the reconstruction.
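
A simple way to find a motion vector is exhaustive block matching, sketched below. The frame contents, the 4x4 block size and the search range of two pixels are assumptions for illustration; practical encoders use larger blocks, wider search ranges and much faster search strategies.

```python
def make_frame(ox, oy, size=8, obj=4):
    """Black frame with a textured obj x obj square at top-left corner (ox, oy)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(obj):
        for i in range(obj):
            frame[oy + j][ox + i] = 100 + 10 * j + i
    return frame


def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))


def find_motion_vector(prev, curr, x, y, size=4, search=2):
    """Search prev for the best match of the curr block at (x, y).

    Returns (cost, dx, dy): the block is best predicted from
    position (x + dx, y + dy) in the previous frame.
    """
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(curr, prev, x, y, px, py, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


prev = make_frame(1, 2)   # object in the previous frame
curr = make_frame(3, 3)   # same object, moved right by 2 and down by 1
unmatched = sad(curr, prev, 3, 3, 3, 3, 4)    # difference without motion search
cost, dx, dy = find_motion_vector(prev, curr, 3, 3)
print(f"difference without motion compensation: {unmatched}")
print(f"motion vector ({dx}, {dy}) leaves a difference of {cost}")
```

With the motion vector applied, the difference that has to be transformed and coded drops to zero in this idealised case, whereas a prediction from the same position in the previous frame leaves a large residual.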

Note: The decoder cannot reconstruct a full original image from the differential information alone. A complete image must be sent from time to time to reset the predictive/differential decoding. These images are referred to as I-frames (intra-frames). With this technique, three image types (frames) can be defined in a group of images (Group of Pictures, GOP):

  • I-frames (Intra-frames): Reference or support image, which has been encoded without reference to the previous or following image
  • B-frames (Bidirectional frames): A predicted image calculated from both the previous and the following image
  • P-frames (Predicted frames): A unilaterally predicted image calculated from the previous image

The distance between two I-frames must not be too large, otherwise the B-frames and P-frames become inaccurate. The arrangement of the frames is called a "Group of Pictures" (GOP). There are different configurations of I-, B- and P-frame sequences for a GOP, depending on the desired outcome (highest quality vs. lowest data rate).
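
A GOP configuration is often described by two numbers: the distance between I-frames and the distance between anchor (I or P) frames. The sketch below generates the frame types in display order from those two values; the function and parameter names are made up for this example.

```python
def gop_pattern(gop_length, anchor_spacing):
    """Frame types in display order for one GOP.

    gop_length: frames from one I-frame to the next.
    anchor_spacing: distance between anchor (I or P) frames;
                    1 means no B-frames are used.
    """
    types = []
    for n in range(gop_length):
        if n == 0:
            types.append("I")
        elif n % anchor_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


print(gop_pattern(12, 3))  # long GOP with two B-frames between anchors
print(gop_pattern(6, 1))   # shorter GOP without B-frames
print(gop_pattern(1, 1))   # I-frame only
```

A longer GOP with more B-frames and P-frames favours a low data rate, while an I-frame-only sequence favours quality and easy editing at the cost of a higher data rate.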

The Wavelet procedure

In the wavelet method, in contrast to the DCT, the picture is not divided into small blocks; the transform is applied to the whole image (or to large tiles of it). The image to be compressed is first transformed line by line and then column by column. After this filtering process, the resulting coefficients are quantised and inserted into a data stream. Wavelet compression is becoming very popular because of its high picture quality.
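
The line-then-column processing can be sketched with the Haar wavelet, the simplest wavelet filter. This is an illustrative choice only: JPEG 2000, for instance, uses longer filters, and a real codec repeats the step on the low-frequency quarter of the result over several levels. The 4x4 test image is made up.

```python
def haar_1d(values):
    """One Haar step: pairwise averages (low band), then differences (high band)."""
    pairs = list(zip(values[0::2], values[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low + high


def haar_2d(image):
    """Transform every line, then every column of the result."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order


image = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 40]]
for row in haar_2d(image):
    print(row)
```

The top-left quarter of the output holds a half-resolution version of the image, and the other three quarters hold the detail coefficients, which are all zero for this smooth example and therefore cheap to code after quantisation.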

It is finding its way into cameras, transmission chains and digital cinema. Examples of wavelet coding are JPEG 2000 and Dirac for contribution/transmission, while REDCODE is used in RED cameras to retain high picture quality.

Compressed data rates

Compression systems can be further divided according to how the output data rate is constrained: fixed or variable bit rate.

Fixed Data Rate

Producing a fixed data rate compression is relatively easy. The data rate is normally determined by the constraints of the next process, e.g. tape storage or broadcast bandwidth. Because the data rate is fixed, there is effectively an overhead on some of the data: simple pictures receive more bits than they need, while complex pictures receive fewer. This overhead causes a compromise in image quality compared with an equivalent variable bit rate (VBR) encode. A good example of a fixed data rate is a single broadcast channel, which has a set amount of bandwidth for transmission, e.g. 4 Mbit/s, of which the broadcaster wants to make maximum use.

Variable Bit Rate

Variable bit rate codecs take full advantage of the statistical analysis of the signal. Typical applications of VBR are Blu-ray discs, hard drives and modern camera systems, e.g. XAVC.
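
The difference between the two approaches can be sketched as a bit allocation problem. The example below is a deliberate simplification: the complexity scores and the bit budget are hypothetical, and real rate control also has to respect buffer and peak-rate limits.

```python
def allocate_fixed(complexities, total_bits):
    """Fixed rate: every frame gets the same share, whatever its content."""
    share = total_bits / len(complexities)
    return [share for _ in complexities]


def allocate_variable(complexities, total_bits):
    """Variable rate: bits follow the measured complexity of each frame."""
    total = sum(complexities)
    return [total_bits * c / total for c in complexities]


complexities = [1, 1, 6, 2]   # hypothetical per-frame complexity scores
total_bits = 4_000_000        # hypothetical budget for these four frames

fixed = allocate_fixed(complexities, total_bits)
variable = allocate_variable(complexities, total_bits)
for c, f, v in zip(complexities, fixed, variable):
    print(f"complexity {c}: fixed {f:>9.0f} bits, variable {v:>9.0f} bits")
```

Both allocations spend the same total, but the variable allocation moves bits from the simple frames to the complex one, which is where a fixed allocation loses quality.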