Journal of the Audio Engineering Society

2014 December - Volume 62 Number 12


MPEG-H Audio—The New Standard for Universal Spatial/3D Audio Coding

Authors: Herre, Jürgen; Hilpert, Johannes; Kuntz, Achim; Plogsties, Jan

To facilitate high-quality, bitrate-efficient distribution and flexible reproduction of 3D sound, the MPEG standardization group is developing MPEG-H Audio Coding. The proposed standard provides a unified system for carriage of encoded 3D sound from channel-based, object-based, and Higher Order Ambisonics (HOA)-based sound formats. Reproduction is supported for many output setups, ranging from 22.2 and beyond down to 5.1, stereo, and binaural reproduction. Depending on the available reproduction configuration, the encoded material is rendered to yield the highest spatial audio quality, overcoming the incompatibility between various 3D reproduction formats. This paper describes the current status of the standardization project and provides an overview of the system architecture, technology, capabilities, and current performance.

Spatial Audio Quality Perception (Part 1): Impact of Commonly Encountered Processes

Authors: Conetta, Robert; Brookes, Tim; Rumsey, Francis; Zielinski, Slawomir; Dewhirst, Martin; Jackson, Philip; Bech, Søren; Meares, David; George, Sunish


Spatial audio processes (SAPs) commonly encountered in consumer audio reproduction systems are known to produce a range of impairments to spatial quality. By way of two listening tests, this paper investigated the degree of degradation of the spatial quality of six 5-channel audio recordings resulting from 48 such SAPs. The choice of SAP can have a large influence on the degree of degradation. Perceived degradation also depends on the particular listeners, the program content, and the listening location. For example, combining an off-center listening position with another SAP can reduce spatial quality significantly compared with listening to that SAP from a central location. Taken together, these findings and the quality-annotated database can guide the development of a regression model of perceived overall spatial audio quality, incorporating previously developed spatially relevant feature-extraction algorithms. The results can guide the development of an artificial-listener-based evaluation system.

Spatial Audio Quality Perception (Part 2): A Linear Regression Model

Authors: Conetta, Robert; Brookes, Tim; Rumsey, Francis; Zielinski, Slawomir; Dewhirst, Martin; Jackson, Philip; Bech, Søren; Meares, David; George, Sunish


The QESTRAL (Quality Evaluation of Spatial Transmission and Reproduction using an Artificial Listener) system is intended to be an artificial-listener-based evaluation system capable of predicting the perceived spatial quality degradations resulting from spatial audio processes (SAPs) commonly encountered in consumer audio reproduction. A generalizable model employing just five metrics and two principal components performs well in predicting quality over a range of program types. Commonly encountered SAPs can have a large deleterious effect on several spatial attributes, including source location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can in turn influence spatial perception. Previously obtained data were used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics. In conjunction with two simple probe signals, the resulting model can form the core of an evaluation system.

Interchannel coherence is one of the key features of stereo and multichannel audio that contribute to the perception of width. Certain production techniques, most commonly those using linear coincident microphones, are characterized by excessively high interchannel coherence. To address this issue, the authors propose a blind spatial sound enhancement technique that adjusts the coherence in frequency bands while minimizing the change of timbre. The adaptive processing uses regularized least squares optimized mixing, decorrelation, and bypassed onsets. Results of listening experiments show that the processing significantly improves the attributes width, preference, and overall sound quality, with no adverse effects observed.


Subjective listening tests were conducted to investigate how the spacing between the main (lower) and height (upper) microphone layers in a 3D main microphone array affects perceived spatial impression and overall preference. It was generally found that layer spacings of 0.5 m, 1 m, and 1.5 m did not produce significant differences in either perceived spatial impression or preference. The 0-m layer had slightly higher ratings than the spaced layers in both spatial impression and preference, depending on the type of source. The four configurations were compared with trumpet, acoustic guitar, percussion quartet, and string quartet using a 9-channel loudspeaker setup. It is suggested that the perceived results were mainly associated with vertical interchannel crosstalk in the signals of each height layer and the magnitude and pattern of spectral change at the listener’s ear caused by each layer. Informal comments suggested that the main preference attributes were tonal quality and spatial quality.

Standards and Information Documents

AES Standards Committee News


Loudness Revisited

Authors: Rumsey, Francis

[Feature] It is a feature of modern life that things are loud, but action can be taken to limit the effects of loudness competition in audio-related services. During the 137th Convention, experts from around the world gathered to debate what is being done to control excessive loudness and deal with problems of dialog intelligibility in consumer systems and the cinema.

Call for Awards Nominations

137th Convention Report, Los Angeles

57th Conference Preview, Hollywood

58th Conference, Aalborg, Call for Papers

59th Conference, Montreal, Call for Papers

139th Convention, New York, Call for Papers

Call for Nominations for Board of Governors


Index to Volume 62

Products and Developments

137th Convention Audio mp3 order form

Annual Report

AES Conventions and Conferences


Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
