Journal of the Audio Engineering Society

2021 January/February - Volume 69 Number 1/2



Along with recent advances in multichannel 3D audio technologies, a number of new microphone techniques for 3D sound recording have been proposed over the years. To choose the technique most suitable for the intended goal of a recording, it is first necessary to understand the design principles, pros, and cons of different techniques. This paper first categorizes existing 3D microphone arrays according to their physical configurations, design philosophies, and purposes, followed by an overview of each array. Studies that have subjectively or objectively evaluated different microphone arrays are also reviewed. Different approaches to the configuration of the upper microphone layer are discussed, aiming to provide theoretical and practical insights into how they can contribute to creating an immersive auditory experience. Finally, limitations of previous studies and future research topics in 3D sound recording are identified.

The Evolution and Design of Flat-Panel Loudspeakers for Audio Reproduction

Authors: Heilemann, Michael C.; Anderson, David A.; Roessner, Stephen; Bocko, Mark F.


The underlying physics and the design of loudspeakers that radiate sound through the bending vibrations of elastic panels, here referred to generically as flat-panel loudspeakers, are reviewed in this paper. The form factor, reduced weight, and aesthetic appeal of flat-panel speakers have made them a topic of interest for more than 90 years, but these advantages have been overshadowed by acoustical shortcomings, specifically the uneven frequency response and directivity in comparison to conventional cone-radiator loudspeakers. Fundamentally, the design challenges of flat-panel speakers arise from the intrinsically large number of mechanical degrees of freedom of a panel radiator. A number of methods have been explored to compensate for the acoustical shortcomings of flat-panel speakers, such as employing inverse filters, equalization, canceling mechanical resonances with actuator arrays, and modifying the panel material, shape, structure, and boundary conditions. Such methods have been used in various combinations to achieve significant audio performance improvements, and carefully designed flat-panel loudspeakers have been rated in blind listening tests as competitive with some prosumer-grade conventional loudspeakers. This review presents a brief historical account of the evolution of flat-panel loudspeakers and summarizes the essential physics and design methodologies that have been developed to optimize their fidelity and directional response.

Manifold Learning Methods for Visualization and Browsing of Drum Machine Samples

Authors: Shier, Jordie; McNally, Kirk; Tzanetakis, George; Brooks, Ky Grace

The use of electronic drum samples is widespread in contemporary music productions, with music producers having an unprecedented number of samples available to them. The task of organizing and selecting from these large collections can be challenging and time-consuming, which points to the need for improved methods for user interaction. This paper presents a system that computationally characterizes and organizes drum machine samples in two dimensions based on sound similarity. The goal of the work is to support the development of intuitive drum sample browsing systems. The methodology presented explores time segmentation, which isolates temporal subsets from the input signal prior to audio feature extraction, as a technique for improving similarity calculations. Manifold learning techniques are compared and evaluated for dimensionality reduction tasks, and used to organize and visualize audio collections in two dimensions. This methodology is evaluated using a combination of objective and subjective methods, including audio classification tasks and a user listening study. Finally, we present an open-source audio plug-in developed using the JUCE software framework that incorporates the findings from this study into an application that can be used in the context of a music production environment.


The soundtracks of movies are composed and mixed in various listening environments, and the final mix is reproduced in cinemas. The variation of electroacoustical properties between the rooms can be significant, and mixes do not translate easily from one location to another. This study aims to elicit the audible differences between six different movie listening environments, which are auralized to an anechoic listening room with 45 loudspeakers. A listening test was performed to determine the attributes that describe the alterations in the sound field between the rooms. Experienced listeners formulated a vocabulary and created an attribute set containing 19 descriptive attributes. The most important attribute was the sense of space when dialogue was evaluated. Moreover, timbre and especially brightness were important when music was evaluated. Furthermore, the change of width and clarity of the sound field was considered important.

In this study, the assessors evaluated the alterations in the sound field of six movie listening environments. The sound fields of the listening environments were auralized to an anechoic listening room with 45 loudspeakers so that assessors could compare the rooms with each other directly. Thirty-one experienced listeners evaluated five descriptive attributes on a continuous scale for each room with two program material items, dialogue and music. The preference ratings for the rooms were also collected. The perceptual evaluations were compared to the objective electroacoustic data of the rooms. The sense of space, clarity, and distance match the measured clarity C50 at the middle frequencies, while the brightness matches the level of the high frequencies in the electroacoustic response above 4 kHz. No psychoacoustical support was found for the current standards, according to which the high frequencies should be attenuated more in large cinemas with longer reverberation than in small cinemas. It turned out that movie sound professionals prefer listening environments that are neither too dead nor too live.
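For reference, clarity C50 is conventionally defined as the ratio, in decibels, of the energy arriving in the first 50 ms of a room impulse response to the energy arriving afterwards. The following is a minimal sketch of that definition only, assuming a broadband impulse response sampled at a known rate. The function name `clarity_c50` and the toy impulse response are illustrative and not taken from the study, which reports C50 at the middle frequencies and would therefore require band filtering first.

```python
import math

def clarity_c50(impulse_response, sample_rate):
    """Return C50 in dB: energy in the first 50 ms over energy after 50 ms."""
    split = round(0.050 * sample_rate)  # index of the first "late" sample
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    if early == 0.0 or late == 0.0:
        raise ValueError("C50 is undefined when early or late energy is zero")
    return 10.0 * math.log10(early / late)

# Toy impulse response (assumed values): a direct sound of amplitude 3
# at t = 0 and a single reflection of amplitude 1 at t = 100 ms, fs = 1 kHz.
toy_ir = [3.0] + [0.0] * 99 + [1.0]
print(round(clarity_c50(toy_ir, 1000), 2))  # 10*log10(9), about 9.54 dB
```

A higher C50 means more of the energy arrives early, which is generally associated with a drier, clearer room.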

Intermodulation Distortion Analysis of a Guitar Distortion Pedal With a Starving Circuit

Authors: Inui, Masaki; Hamasaki, Toshihiko; van der Veen, Menno


Despite the recent trend of digital transformation in the music industry, the popularity of guitar effects pedals (GEPs) designed with analog components has not declined. This paper describes the complexity of the nonlinear characteristics of the analog circuitry in a distortion pedal, which originates not only from clipping diodes but also from the integrated operational amplifier itself. It is well known that variation in the supply voltage of a distortion pedal influences its sound. Based on this phenomenon, we have designed a voltage-starving circuit to control various distorted transfer functions depending on the frequency. Particular attention is given to the difference between odd and even nonlinearity in the mechanism of generating intermodulation distortion (IMD) for two-tone dissonance and consonance. These transfer functions are analyzed in detail using a 9th-order polynomial approximation. As a result, all peaks of the intermodulation frequencies are successfully identified in a complex spectrum. Furthermore, the spectral shape of the measured IMD peaks is reproduced with an error of less than 50 dB by the simulation of their approximate formula.
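As background to the odd/even distinction, a memoryless polynomial nonlinearity driven by two tones f1 and f2 produces components at |m·f1 + n·f2| for integers m and n. A pure k-th power term contributes only products whose order |m| + |n| is at most k and has the same parity as k, so even-order terms yield products such as f2 − f1, while odd-order terms yield products such as 2·f1 − f2. The sketch below simply enumerates those frequencies up to 9th order. It is an illustration of the general principle, not the authors' analysis; the function name `imd_products` and the tone frequencies are assumptions.

```python
def imd_products(f1, f2, max_order):
    """Map each frequency |m*f1 + n*f2| to the lowest order |m|+|n| producing it."""
    products = {}
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            if order == 0 or order > max_order:
                continue  # skip the trivial term and anything above max_order
            freq = abs(m * f1 + n * f2)
            if freq not in products or order < products[freq]:
                products[freq] = order
    return products

# Illustrative tone pair in Hz (assumed, not from the paper).
products = imd_products(400, 500, 9)
for freq in sorted(products)[:8]:
    parity = "even" if products[freq] % 2 == 0 else "odd"
    print(f"{freq} Hz: lowest order {products[freq]} ({parity})")
```

With harmonically related tones such as these, several products of different orders land on the same frequency, which is one reason a measured two-tone spectrum can be difficult to read peak by peak.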

Assessing Spherical Harmonics Interpolation of Time-Aligned Head-Related Transfer Functions

Authors: Arend, Johannes M.; Brinkmann, Fabian; Pörschmann, Christoph


High-quality spatial audio reproduction over headphones requires head-related transfer functions (HRTFs) with high spatial resolution. However, acquiring datasets with a large number of (individual) HRTFs is not always possible, and using large datasets can be problematic for real-time applications with limited resources. Consequently, interpolation methods for sparsely sampled HRTFs are of great interest, with spherical harmonics (SH) interpolation becoming increasingly popular. However, the SH representation of sparse HRTFs suffers from spatial aliasing and order truncation errors. To mitigate this, preprocessing methods have been introduced that time-align the sparse HRTFs before SH interpolation. This reduces the effective SH order and thus the number of HRTFs required for SH interpolation. In this paper, we present a physical evaluation of four state-of-the-art preprocessing methods, which showed very similar performance of the methods, with notable differences only at low SH orders and for contralateral HRTFs. We also performed a listening experiment with one selected method to determine the minimum SH order required for perceptually transparent interpolation. For the selected method, a sparse HRTF set of order N ≈ 7 is sufficient for interpolating a frontal source presenting speech or percussion. Higher orders are, however, required for a lateral source and noise.

Standards and Information Documents

AES Standards Committee News

Download: PDF (52.09 KB)


The attribute of presence was explored in relation to immersive audio, in papers presented at the Fall 2020 convention. We learn about the success of rendering techniques for immersive spatial audio, as well as the emulation and capture of acoustic spaces. An attempt is described to develop a recording and mixing approach aimed at Dolby Atmos immersive reproduction.

149th Conference Report

Download: PDF (531.75 KB)

149th abstracts

Download: PDF (569.02 KB)

149th Exhibitors and Sponsors

Download: PDF (141.14 KB)

AES New Officers

Download: PDF (248.27 KB)


Section News

Download: PDF (255.88 KB)

Book Review

Download: PDF (115.75 KB)

New Products

Download: PDF (375.75 KB)




Table of Contents

Download: PDF (42.72 KB)

Cover & Sustaining Members List

Download: PDF (76.9 KB)

AES Officers, Committees, Offices & Journal Staff

Download: PDF (75.12 KB)
