Capturing audio for 3D immersive reproduction

Recently proposed multichannel audio formats such as Dolby Atmos, Auro-3D and NHK 22.2 employ height channels to provide the auditory sensation of a “three-dimensional (3D)” space. This project, funded by EPSRC (EP/L019906/1), aims to provide fundamental psychoacoustic principles for the perception, recording and reproduction of the height dimension in 3D reproduction. Below is a summary of some of the main findings and outcomes from this project so far.

  • The effect of vertical microphone spacing in a main microphone array on perceived spatial impression in 3D reproduction is not significant (Lee and Gribben 2014). This led to the design of a 3D microphone array called PCMA-3D, which is horizontally spaced but vertically coincident. This finding has also been adopted in the new design of Schoeps’s ORTF-3D microphone array, which is a great example of innovation through academic research.
  • To avoid an unwanted upward shift of the source image in 3D reproduction, direct sound captured or reproduced from a height channel (i.e. vertical interchannel crosstalk) should be attenuated by at least 7 dB relative to the same sound captured or reproduced from the main channel (Lee 2012; Wallis and Lee 2016; Wallis and Lee 2017). This implies that a directional microphone serving a height channel should be angled upwards far enough to reduce the amount of direct sound it picks up.
  • Interchannel time difference is not a reliable cue for vertical phantom imaging (Wallis and Lee 2015). In other words, vertical summing localisation using the time delay cue does not work. Furthermore, the precedence effect does not operate in the vertical plane in the strict sense. Some localisation dominance towards the earlier source can be observed depending on the frequency band, but the perceived image position is never shifted fully to the earlier source.
  • The effect of vertical interchannel decorrelation is minimal compared with that of horizontal decorrelation (Gribben and Lee 2017). The effect of vertical decorrelation is significant, albeit with a small effect size, only above around 500 Hz (Gribben and Lee 2018).
  • A large-scale library of impulse responses, captured for 13 source positions and 40 different microphone array configurations ranging from stereo to 3D, has been established (Lee and Millns 2017). This is available for free download at
  • A Pure Audio Blu-ray album of the choir Siglo de Oro has been produced and released in Dolby Atmos, Auro-3D 9.1 and DTS 5.1 formats.
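
The 7 dB guideline for vertical interchannel crosstalk can be illustrated with a short level check. The following is only a minimal sketch, not the method used in the cited studies: it assumes the direct sound in each channel is available as a list of samples and compares broadband RMS levels, which is a simplification. The function names (rms, height_attenuation_db, crosstalk_is_safe) are hypothetical and chosen for illustration; only the 7 dB threshold comes from the findings above.

```python
import math

# Minimum attenuation of the height channel relative to the main channel,
# taken from the finding summarised above.
CROSSTALK_THRESHOLD_DB = 7.0


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def height_attenuation_db(main, height):
    """How many dB the height channel sits below the main channel.

    Assumption: broadband RMS level is used as a stand-in for the
    level of the direct sound in each channel.
    """
    main_rms = rms(main)
    height_rms = rms(height)
    if main_rms == 0.0:
        raise ValueError("main channel is silent")
    if height_rms == 0.0:
        return math.inf  # no crosstalk at all
    return 20.0 * math.log10(main_rms / height_rms)


def crosstalk_is_safe(main, height, threshold_db=CROSSTALK_THRESHOLD_DB):
    """True if the height channel is attenuated by at least threshold_db."""
    return height_attenuation_db(main, height) >= threshold_db


# Usage: a 1 kHz test tone in the main channel, with two hypothetical
# height-channel signals that are scaled copies of it.
fs = 48000
main = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 10)]
height_10db = [x * 10 ** (-10 / 20) for x in main]  # 10 dB below main
height_3db = [x * 10 ** (-3 / 20) for x in main]    # 3 dB below main

print(crosstalk_is_safe(main, height_10db))  # True
print(crosstalk_is_safe(main, height_3db))   # False
```

In a real recording the direct sound would first need to be separated from reflections and reverberation, for example by windowing an impulse response, before a comparison like this is meaningful.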

Project funded by EPSRC (EP/L019906/1) and the University of Huddersfield

Researchers: Dr Hyunkook Lee, Dr Dale Johnson, Dr Christopher Gribben, Dr Rory Wallis, Connor Millns

Supervisor: Dr Hyunkook Lee

