Author: Karl M. Slavik

Next Generation Audio and Binaural Recording and Processing

Important for the Future: Next Generation Audio & Binaural Sound Techniques.

With the introduction of the new audio codecs Dolby AC-4 and MPEG-H, producers are being challenged to think about object-based 3D audio, and also to consider more effective sound mixes.

One of the biggest advantages of the newer audio codecs is the ability to embed an audio description track for viewers with access requirements. This track is superimposed over the program audio and can be configured either by the listener (receiver-mixed) or by the broadcaster through their transmitter configuration (broadcast-mixed).

In the future, viewers may be able to remove the commentary of a sports presenter they dislike, or choose another language version. Even Skype conferencing and audio from a ‘second screen’ (tablet or smartphone) can be used together with the TV sound to create a personalised mix.

It is also conceivable that viewers may be able to specify the mixing ratio between the background sounds (music, sound effects, atmos) and the dialogue itself to aid speech intelligibility. This could be very popular with hearing-impaired viewers. Next Generation Audio (NGA) formats do, however, require changes in the production chain and in production processes. Their introduction has to be accompanied by the generation and checking of dynamic metadata for the respective encoding system, which presents a new challenge.
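The idea of a listener-adjustable balance between dialogue and background can be illustrated with a small sketch. This is not how an AC-4 or MPEG-H decoder is implemented; it only assumes that dialogue and background arrive as separate mono audio objects with equal length, and the function names and parameters (`personalised_mix`, `dialogue_db`, `background_db`) are hypothetical.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def personalised_mix(dialogue, background, dialogue_db=0.0, background_db=0.0):
    """Mix two mono objects (lists of samples in the range -1..1).

    dialogue_db and background_db are the listener's chosen level offsets.
    Both names are assumptions made for this illustration.
    """
    if len(dialogue) != len(background):
        raise ValueError("audio objects must have the same length")
    g_dialogue = db_to_gain(dialogue_db)
    g_background = db_to_gain(background_db)
    mixed = [g_dialogue * d + g_background * b
             for d, b in zip(dialogue, background)]
    # Hard clipping as a crude safeguard; a real receiver would use a limiter.
    return [max(-1.0, min(1.0, s)) for s in mixed]


# A viewer lowers the background by 20 dB to make speech easier to follow.
clearer = personalised_mix([0.5, -0.5], [0.2, 0.2], background_db=-20.0)
```

Because the gains are applied in the receiver rather than in the studio, the same transmitted objects can serve both the default mix and a speech-focused mix.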

New ways of working for the acquisition of object-based audio, as well as for the personalisation of audio programs (e.g. for the selection of audio tracks), have to be introduced, and viewers need to understand how to use them correctly. The measurement of loudness for these new audio formats, as well as the determination of loudness for the different user formats (down-mixing), is another complex issue that will need clarifying before the introduction of new standards. Currently, only a few manufacturers have suitable monitoring equipment.

Binaural Sound Mix

Binaural recording methods have existed for a long time. The use of a dummy head with microphones in the ears creates an excellent and extremely natural three-dimensional rendering of the soundscape, and it requires only two audio transmission channels. This makes it possible to reproduce all the cues that humans use to orient themselves in an environment. However, level differences between the ears and the head-related transfer function (HRTF) need to be taken into account. The HRTF describes the complex filtering effects of the listener's head, outer ears and body, which must be accounted for if the sound is to be played back correctly. Playback should also be tailored to the viewer's individual headphones, to ensure that the sound is correctly configured for their use. Even without this process, listening to binaural recordings via headphones can be staggeringly immersive, and when played back over loudspeakers the sound is still very satisfactory.
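In signal-processing terms, the filtering described by the HRTF can be applied to a mono source by convolving it with a pair of head-related impulse responses (HRIRs), one per ear. The following is a minimal sketch under stated assumptions: the two impulse responses are toy values invented for illustration (a real HRIR is measured and has hundreds of taps), and the helper names `convolve` and `render_binaural` are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source with a left/right HRIR pair.

    Returns the two channels intended for headphone playback.
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIRs for a source on the listener's left (assumed values):
# the left ear hears the sound directly, the right ear hears it
# two samples later (time difference) and at half level (level difference).
hrir_left = [1.0]
hrir_right = [0.0, 0.0, 0.5]

left, right = render_binaural([1.0, 0.5], hrir_left, hrir_right)
```

The delay and attenuation in the right-ear response stand in for the interaural time and level differences; measured HRIRs additionally encode the spectral colouring caused by the outer ear.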

KU100 Z dummy head for binaural recording

One major disadvantage of the binaural recording process is that conventional production methods such as miccing and mixing need to be modified to take into account new equipment and production processes. This can have a financial impact. A binaural recording therefore does not replace existing audio but complements it by providing additional audio signals. Significant advances in the field of digital signal processing and more powerful processors in smartphones, tablets and home devices allow the application of binaural sound for gaming, movies and TV. These uses include device-specific processing of object-based audio or Dolby Atmos for viewers who watch many of their programmes on smartphones or tablets. The ability of a smartphone to use its computational power to process audio represents an interesting opportunity for sound innovation in television programme viewing.

Realiser A8 from Smyth Research with head tracking

In the future, true binaural productions will replace existing microphones and place the listener in the centre of an event such as a classical concert. This will require binaural panning in post-production, and there is also the question of whether camera operators are happy to have a binaural dummy head in the shot. And of course, for those who dislike or cannot access binaural sound, the production may additionally have to offer a ‘normal’ stereo and surround signal.

Author: Karl M. Slavik, Arte Cast Vienna
