Digital Audio and Codecs I

Recording and editing of TV sound is now almost exclusively digital using PCM (Pulse Code Modulation).

24-bit audio with a sampling frequency of 48 kHz is the common sound format for television production. This format is supported by most mixing consoles, audio recorders, codecs, storage and playout systems. 24 bits allow a signal-to-noise ratio of more than 100 dB in practice (about 146 dB in theory) and generous headroom with equally low distortion. The sampling frequency of 48 kHz allows an audio frequency response of at least 20 to 20,000 Hz, which covers the range of human hearing.
PCM sound provides very high quality, but also a high data rate. It is used inside the broadcaster and throughout the production chain, but not for transmission to the consumer. To make the audio easily transportable, data reduction is required. As with video signals, multiple encoding formats are available for the transmission of stereo and surround sound (see table).
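The theoretical figure follows from the standard approximation for an ideal N-bit quantiser driven by a full-scale sine wave, roughly 6.02 x N + 1.76 dB, and the usable bandwidth follows from the Nyquist limit of half the sampling frequency. A minimal sketch of both calculations; the function names are illustrative and do not come from any audio library:

```python
import math

def theoretical_snr_db(bits: int) -> float:
    """Ideal quantisation SNR for a full-scale sine wave, in dB."""
    # 20*log10(2**bits) contributes about 6.02 dB per bit;
    # 10*log10(1.5) is the roughly 1.76 dB sine-wave term.
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest audio frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2

for bits in (16, 20, 24):
    print(f"{bits}-bit: {theoretical_snr_db(bits):.1f} dB")
print(f"48 kHz sampling: up to {nyquist_hz(48_000):.0f} Hz")
```

Real converters fall well short of the ideal value because of analogue noise, which is why the practical figure quoted above is lower.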
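To put a number on the "high data rate", the bit rate of linear PCM is simply sampling frequency times word length times channel count. A small sketch, with a hypothetical helper name:

```python
def pcm_bit_rate(sample_rate_hz: int, bits: int, channels: int) -> int:
    """Uncompressed linear PCM bit rate in bit/s (audio payload only)."""
    return sample_rate_hz * bits * channels

for name, channels in {"stereo": 2, "5.1": 6}.items():
    rate = pcm_bit_rate(48_000, 24, channels)
    megabytes_per_minute = rate * 60 / 8 / 1e6
    print(f"{name}: {rate / 1e6:.3f} Mbit/s, "
          f"about {megabytes_per_minute:.1f} MB per minute")
```

These figures exclude any container or interface overhead, so real file sizes will be slightly larger.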

PCM sound for production

Whether stereo, 5.1 or 3D audio, sound should be recorded and edited as linear PCM at 48 kHz and 24-bit. It’s important to check with your crew what their equipment can record, as some acquisition tools, whether disc or file-based, only offer a limited audio resolution of 16 or 20 bits.
For high-quality deliveries, or when working with Dolby E, resolution and bit transparency must be taken into account. The term "bit transparency" describes the ability of a system to store or pass a signal without any change or loss, leaving the data bits untouched no matter where they have been in the production process.
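One simple way to check bit transparency is to compare the PCM bytes that went into a system with the bytes that came out. The sketch below is only an illustration: the helper names are hypothetical, and the test data is a handful of made-up 24-bit little-endian samples rather than real programme audio.

```python
import hashlib

def is_bit_transparent(original: bytes, processed: bytes) -> bool:
    """True only if every bit of the PCM payload survived unchanged."""
    return hashlib.sha256(original).digest() == hashlib.sha256(processed).digest()

def truncate_24_to_16(pcm24: bytes) -> bytes:
    """Simulate a 16-bit device: zero the low byte of each 24-bit LE sample."""
    out = bytearray(pcm24)
    low_bytes = out[0::3]                 # least significant byte of each sample
    out[0::3] = bytes(len(low_bytes))     # replace them with zeros
    return bytes(out)

# Made-up sample values packed as 24-bit little-endian signed integers.
samples = [1, -1, 0x123456, -0x123456]
original = b"".join(s.to_bytes(3, "little", signed=True) for s in samples)

print(is_bit_transparent(original, bytes(original)))              # untouched copy
print(is_bit_transparent(original, truncate_24_to_16(original)))  # truncated path
```

Because a Dolby E stream travels as data inside the PCM words, any truncation, gain change or sample rate conversion along the way would fail a check like this and corrupt the stream.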

With the removal of tape formats from audio and video production, codecs now take their place, and audio and video are wrapped into files. Modern video codecs such as DNxHR, ProRes, XAVC Intra, H.264 and H.265 are not automatically tied to a particular audio codec. For example, a DSLR may record H.264 video with PCM sound, whereas a TV station that multiplexes H.264 into a transport stream (MPEG-TS) compresses the audio with a codec such as Dolby Digital Plus or HE-AAC to save data rate. XDCAM supports five video codecs, PCM audio at 16 or 24 bits and 48 kHz, and proxy audio with A-law encoding at 8-bit resolution and an 8 kHz sampling frequency.
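A-law is a companding scheme: quiet signals are given more of the available code range than loud ones, which is how 8 bits remain usable for speech-grade proxy audio. The sketch below implements the continuous A-law curve with the commonly used parameter A = 87.6. It is an assumption-laden illustration, not the segmented 8-bit encoding that real devices use, and the source does not state exactly which A-law variant XDCAM implements.

```python
import math

A = 87.6  # commonly used A-law compression parameter (assumed here)

def alaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] onto the companded A-law curve."""
    ax = abs(x)
    if ax < 1 / A:
        y = A * ax / (1 + math.log(A))                    # linear segment
    else:
        y = (1 + math.log(A * ax)) / (1 + math.log(A))    # logarithmic segment
    return math.copysign(y, x)

def alaw_expand(y: float) -> float:
    """Inverse of alaw_compress."""
    ay = abs(y)
    if ay < 1 / (1 + math.log(A)):
        x = ay * (1 + math.log(A)) / A
    else:
        x = math.exp(ay * (1 + math.log(A)) - 1) / A
    return math.copysign(x, y)

# Proxy audio rate per channel: 8 bits at 8 kHz.
proxy_bit_rate = 8 * 8_000
print(f"Proxy audio: {proxy_bit_rate / 1000:.0f} kbit/s per channel")
print(f"A quiet sample of 0.01 is companded to {alaw_compress(0.01):.3f}")
```

The 8 kHz sampling frequency limits the proxy track to an audio bandwidth below 4 kHz, so it is suitable for logging and offline work rather than delivery.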




[Illustration: Basic technical properties of some audio recording formats]

Surround sound is preferably recorded as a set of discrete PCM channels (i.e. 6 x PCM for 5.1), which makes it much easier to transmit the correct tracks together with Dolby metadata or metadata presets. SDI and HD-SDI interfaces can transport up to four audio packages, each with four audio channels, for 16 channels in total (32 channels with dual link).
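The packaging of four channels per package means that a discrete 5.1 mix always spans two packages. The sketch below maps a channel number to its package and position; the helper name and the track order (L, R, C, LFE, Ls, Rs) are assumptions for illustration, so check the delivery specification for the order actually required.

```python
CHANNELS_PER_PACKAGE = 4
PACKAGES_PER_LINK = 4

def sdi_slot(channel: int) -> tuple[int, int]:
    """Map a 1-based embedded channel number to (package, position in package)."""
    if not 1 <= channel <= CHANNELS_PER_PACKAGE * PACKAGES_PER_LINK:
        raise ValueError("a single link carries channels 1 to 16")
    package, position = divmod(channel - 1, CHANNELS_PER_PACKAGE)
    return package + 1, position + 1

# Assumed 5.1 track order, for illustration only.
layout = ["L", "R", "C", "LFE", "Ls", "Rs"]
for number, name in enumerate(layout, start=1):
    package, position = sdi_slot(number)
    print(f"{name:>3}: channel {number} -> package {package}, position {position}")
```

With this layout the surround channels land in the second package, which is why the timing relationship between packages matters for multichannel audio.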

SDI interfaces to SMPTE 259M (standard definition) deliver a maximum of four audio packages and therefore 16 audio channels. They should not be used for the storage and distribution of discrete multichannel audio: the phase relationship between the individual audio groups is not exactly defined and depends on how each supplier has implemented SDI audio, which in turn can introduce errors when downmixing.

In this case Dolby E is recommended as a high-quality way to feed 6 channels (at 16-bit) or 8 channels (at 20-bit) plus metadata into a stereo pair. HD-SDI interfaces to SMPTE 292M, 372M (dual link) and 424M (3G) are designed to be phase stable and allow the transport of surround sound without Dolby E.
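A rough calculation shows how much data reduction this implies. The sketch compares the payload of a stereo pair at 48 kHz with the same number of channels as discrete 24-bit PCM. The function names are hypothetical, and the ratios are only indicative, since the Dolby E stream also has to carry metadata and framing within that payload.

```python
def stereo_pair_payload_bps(word_bits: int, sample_rate_hz: int = 48_000) -> int:
    """Raw payload of one stereo pair: two words per sample period."""
    return 2 * word_bits * sample_rate_hz

def discrete_pcm_bps(channels: int, bits: int = 24,
                     sample_rate_hz: int = 48_000) -> int:
    """Bit rate of the same programme as discrete linear PCM channels."""
    return channels * bits * sample_rate_hz

for channels, word_bits in ((6, 16), (8, 20)):
    carrier = stereo_pair_payload_bps(word_bits)
    pcm = discrete_pcm_bps(channels)
    print(f"{channels} channels in a {word_bits}-bit pair: "
          f"{carrier / 1e6:.3f} Mbit/s carrier vs {pcm / 1e6:.3f} Mbit/s PCM "
          f"(about {pcm / carrier:.1f}:1)")
```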

Data Reduction in production

Data reduction in audio is everywhere, from the MP3s on your phone to the streaming music services you listen to. However, data rate reduction in production should only be used if the quality of the encoding is suitable for professional work, or is good enough for live contributions in news reporting (for example Audio-over-IP or satellite links) when there really is no other sound available.

The encoding formats AAC, HE-AAC, Windows Media Audio and MP3, which are often supplied with audio and video editors, can apply very high levels of compression. However, the use of data-reduced material in a production needs to be carefully considered, as it can have a significant impact on the final version of the programme once it has been through the television transmission process. In some cases coding artifacts can be heard. It’s important to consult the delivery requirements of the broadcaster you are working with in order to safeguard the highest possible audio quality.

Author: Karl M. Slavik, Arte Cast Vienna
