Only AES members and Institutional Journal Subscribers can download
Authors: Herre, Jürgen; Hilpert, Johannes; Kuntz, Achim; Plogsties, Jan
To facilitate high-quality, bitrate-efficient distribution and flexible reproduction of 3D sound, the MPEG standardization group is developing MPEG-H Audio Coding. This proposed standard will allow for universal carriage of encoded 3D sound from channel-based, object-based, and Higher Order Ambisonics (HOA)-based sound formats in a single unified system. Reproduction is supported for many output setups, ranging from 22.2 and beyond down to 5.1, stereo, and binaural reproduction. Depending on the available reproduction configuration, the encoded material is rendered to yield the highest spatial audio quality, overcoming the incompatibility between various 3D reproduction formats. This paper describes the current status of the standardization project and provides an overview of the system architecture, technology, capabilities, and current performance.
Download: PDF (HIGH Res) (505.03 KB)
Download: PDF (LOW Res) (265.32 KB)
Authors: Conetta, Robert; Brookes, Tim; Rumsey, Francis; Zielinski, Slawomir; Dewhirst, Martin; Jackson, Philip; Bech, Søren; Meares, David; George, Sunish
Spatial audio processes (SAPs) commonly encountered in consumer audio reproduction systems are known to produce a range of impairments to spatial quality. By way of two listening tests, this paper investigated the degree of degradation of the spatial quality of six 5-channel audio recordings resulting from 48 such SAPs. The choice of SAP can have a large influence on the degree of degradation. Perceived degradation also depends on the particular listeners, the program content, and the listening location. For example, combining off-center listening with another SAP can reduce spatial quality significantly compared to listening to that SAP from a central location. Taken together, these findings and the quality-annotated database can guide the development of a regression model of perceived overall spatial audio quality, incorporating previously developed spatially relevant feature-extraction algorithms. The results can also guide the development of an artificial-listener-based evaluation system.
Download: PDF (HIGH Res) (1.89 MB)
Download: PDF (LOW Res) (456.91 KB)
Authors: Conetta, Robert; Brookes, Tim; Rumsey, Francis; Zielinski, Slawomir; Dewhirst, Martin; Jackson, Philip; Bech, Søren; Meares, David; George, Sunish
The QESTRAL (Quality Evaluation of Spatial Transmission and Reproduction using an Artificial Listener) system is intended to be an artificial-listener-based evaluation system capable of predicting the perceived spatial quality degradations resulting from SAPs (Spatial Audio Processes) commonly encountered in consumer audio reproduction. Commonly encountered SAPs can have a large deleterious effect on several spatial attributes, including source location, envelopment, coverage angle, ensemble width, and spaciousness. They can also affect timbre, and changes to timbre can in turn influence spatial perception. Previously obtained data were used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics. A generalizable model employing just five metrics and two principal components performs well in predicting quality over a range of program types. In conjunction with two simple probe signals, the resulting model can form the core of an evaluation system.
Download: PDF (HIGH Res) (2.4 MB)
Download: PDF (LOW Res) (334.12 KB)
Authors: Vilkamo, Juha; Pulkki, Ville
Interchannel coherence is one of the key features of stereo and multichannel audio that contribute to the perception of width. Certain production techniques, most commonly those using linear coincident microphones, are characterized by excessively high interchannel coherence. To address this issue, the authors propose a blind spatial sound enhancement technique that adjusts the coherence in frequency bands while minimizing the change of timbre. The adaptive processing uses regularized least-squares optimized mixing, decorrelation, and bypassed onsets. Listening experiments show that the processing significantly improves the attributes width, preference, and overall sound quality, with no adverse effects observed.
Download: PDF (HIGH Res) (458.82 KB)
Download: PDF (LOW Res) (290.55 KB)
Authors: Lee, Hyunkook; Gribben, Christopher
Subjective listening tests were conducted to investigate how the spacing between the main (lower) and height (upper) microphone layers in a 3D main microphone array affects perceived spatial impression and overall preference. Four layer spacings (0 m, 0.5 m, 1 m, and 1.5 m) were compared using trumpet, acoustic guitar, percussion quartet, and string quartet recordings reproduced over a 9-channel loudspeaker setup. In general, layer spacings of 0.5 m, 1 m, and 1.5 m did not produce significant differences in either perceived spatial impression or preference. The 0-m layer received slightly higher ratings than the spaced layers in both spatial impression and preference, depending on the type of source. It is suggested that the perceived results were mainly associated with vertical interchannel crosstalk in the signals of each height layer and with the magnitude and pattern of spectral change at the listener’s ear caused by each layer. Informal comments suggested that the main preference attributes were tonal quality and spatial quality.
Download: PDF (HIGH Res) (5.11 MB)
Download: PDF (LOW Res) (724.05 KB)
Authors: Rumsey, Francis
[Feature] It is a feature of modern life that things are loud, but action can be taken to limit the effects of loudness competition in audio-related services. During the 137th Convention, experts from around the world gathered to debate what is being done to control excessive loudness and deal with problems of dialog intelligibility in consumer systems and the cinema.
Download: PDF (243.68 KB)