Journal of the Audio Engineering Society

2019 September - Volume 67 Number 9


Nonlinear Distortion Reduction in Sound Zones by Constraining Individual Loudspeaker Control Effort

Authors: Ma, Xiaohui; Hegarty, Patrick J.; Jørgensen, Kristoffer F.; Larsen, Jakob Juul

A personal sound zone system uses loudspeaker arrays to render different audio content to multiple listening groups within the same physical space, giving each group a concurrent, interference-free listening experience. The zones in which a given program is rendered are called bright zones. Nonlinear distortion in loudspeaker drivers can cause audible artifacts and can degrade the acoustic contrast, especially at high driving levels. The distortion can be reduced by constraining the total control effort, but artifacts can still be present when one or several loudspeaker drivers have high control effort. To reduce nonlinear distortion, the authors applied individual control effort constrained acoustic contrast control (ICECACC), in which control effort constraints are imposed on each individual loudspeaker driver. Simulations and experiments were performed on a two-zone setup, with one bright and one dark zone, using either ICECACC or acoustic contrast control (ACC) and a two-tone stimulus that generates both harmonic and intermodulation distortion. Frequency-resolved measurements show that ICECACC and ACC give nearly identical acoustic contrast at the two fundamental frequencies, but ICECACC produces less nonlinear distortion than ACC. Experiments using a multitone stimulus and identical total control efforts also gave reduced nonlinear distortion with ICECACC compared with ACC; however, this was achieved at the expense of contrast. The results show that a compromise can be made between acoustic contrast and nonlinear distortion.
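As a rough illustration of the baseline that the abstract compares against, the following minimal sketch computes plain (unconstrained) ACC weights at a single frequency and reports the per-driver control effort that ICECACC would bound. It is not the authors' ICECACC solver. The transfer matrices `G_bright` and `G_dark`, the regularization value `reg`, and all function names are assumptions made for this example.

```python
import numpy as np


def acc_weights(G_bright, G_dark, reg=1e-6):
    """Plain ACC weights maximizing the bright/dark energy ratio.

    G_bright, G_dark: (n_mics, n_speakers) complex transfer matrices at one
    frequency (hypothetical inputs). reg is an assumed Tikhonov term.
    """
    R_b = G_bright.conj().T @ G_bright  # bright-zone spatial correlation
    R_d = G_dark.conj().T @ G_dark      # dark-zone spatial correlation
    n = R_d.shape[0]
    # The principal eigenvector of (R_d + reg*I)^-1 R_b maximizes the
    # regularized energy ratio between the two zones.
    M = np.linalg.solve(R_d + reg * np.eye(n), R_b)
    vals, vecs = np.linalg.eig(M)
    q = vecs[:, np.argmax(vals.real)]
    return q / np.linalg.norm(q)        # unit total control effort


def contrast_db(G_bright, G_dark, q):
    """Acoustic contrast: mean bright-zone energy over mean dark-zone energy."""
    e_b = np.mean(np.abs(G_bright @ q) ** 2)
    e_d = np.mean(np.abs(G_dark @ q) ** 2)
    return 10.0 * np.log10(e_b / e_d)


def per_driver_effort(q):
    """Squared weight magnitude of each driver; ICECACC constrains each entry."""
    return np.abs(q) ** 2
```

Inspecting `per_driver_effort(q)` shows whether the unit total effort is concentrated in a few drivers, which is the situation the abstract identifies as a remaining source of distortion under a total-effort constraint.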

The direction of a sound source in relation to the listener significantly affects the loudness of the sounds it produces, especially in the horizontal plane, where interaural time difference (ITD) is the main localization cue. There is growing awareness of this phenomenon, known as directional loudness sensitivity (DLS), which has to be taken into account in audio reproduction systems, especially multichannel systems. So far, this effect has only been studied for sounds generated and presented directly over headphones, which are not natural listening conditions. The present study investigates the effect for low-frequency noises originating from real sources. Twenty subjects assessed the loudness of stimuli presented both by loudspeakers arranged at various locations within a listening room and by dummy-head recordings reproduced virtually over headphones. Results show that the DLS is in agreement with the previously revealed ITD effect. Moreover, the DLS was higher when stimuli were reproduced over headphones than over loudspeakers, specifically when frontal sources were located at a short distance from the listeners. One hypothesis for this effect relies on visual cues, available to the listeners only when sounds were reproduced over loudspeakers, that provided information about the source distance. Listeners were also aware of whether sounds were reproduced over loudspeakers or headphones, which may have led to different loudness assessments and hence to DLS differences.

Generalized Metrics for Constant Directivity

Authors: Sridhar, Rahulram; Tylka, Joseph G.; Choueiri, Edgar Y.


Many applications in audio, for example sound reinforcement and selective microphone beams, benefit from transducer arrays whose directional characteristics do not vary with frequency. The coverage angle should be constant over a usable frequency range. Metrics are proposed for quantifying the extent to which a transducer’s polar radiation (or sensitivity) pattern is invariant with frequency. As there is currently no established measure of this quality (often called “controlled” or “constant directivity”), this paper proposes five metrics, each based on commonly used criteria for constant directivity: 1) a Fourier analysis of sensitivity contour lines (i.e., lines of constant sensitivity over frequency and angle), 2) the average of spectral distortions within a specified angular listening window, 3) the solid angle of the frontal region with distortions below a specified threshold, 4) the standard deviation of the directivity index, and 5) cross-correlations of polar responses. These metrics are computed for ten loudspeakers, which are ranked from most constant-directive to least according to each metric. The resulting values and rankings are compared, and the suitability of each metric for comparing transducers in different applications is assessed. For critical listening applications in reflective or dynamic listening environments, metric 1 appears most suitable, while for such applications in acoustically treated and static environments, metrics 2 and 3 may be preferable. Furthermore, for high-amplitude applications (e.g., live sound) in reflective or noisy environments, metrics 4 and 5 appear most suitable.
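To make metric 4 concrete, here is a minimal sketch of how the standard deviation of the directivity index over frequency could be computed. It is not the authors' exact procedure. It assumes an axisymmetric radiation pattern sampled on a uniform polar-angle grid from 0 to π with the on-axis response first, and the function names are hypothetical.

```python
import numpy as np


def directivity_index_db(mag, theta):
    """Directivity index per frequency, in dB.

    mag:   (n_freq, n_theta) linear magnitude responses (assumed layout).
    theta: polar angles in radians on a uniform grid from 0 to pi,
           with theta[0] = 0 taken as the on-axis direction.
    """
    power = np.asarray(mag, dtype=float) ** 2
    w = np.sin(theta)                      # area weight of each polar ring
    mean_power = (power @ w) / w.sum()     # spherical average of the power
    return 10.0 * np.log10(power[:, 0] / mean_power)


def di_std_db(mag, theta):
    """Standard deviation of the directivity index across frequency."""
    return float(np.std(directivity_index_db(mag, theta)))
```

Under these assumptions a value of 0 dB means the directivity index is identical at every analyzed frequency, and larger values indicate a less constant-directive transducer.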

Engineering reports

CD-4 (or Compatible Discrete 4 Channel) was a short-lived, four-channel, surround-sound system for phonograph records. Developed by JVC in Japan, the system was adopted in America around 1972 by RCA, where it was known as RCA Quadradisc. Unlike matrix quadraphonic systems, CD-4 took a more radical approach. The baseband signals, which modulate the groove, are the sums of the front and back signals, (LF + LB) and (RF + RB). The difference signals, used to separate back from front in the decoder, are FM encoded on a pair of ultrasonic (30 kHz) subcarriers recorded above this baseband signal. The development of a new, software-based decoder for CD-4 phonograph records is described in this report. A relatively complete understanding of the original hardware decoders is necessary, and this analysis is new. A special phono cartridge with a frequency response extended up to 45 kHz is required, and this must be fitted with a Shibata or line-contact stylus to track the high-frequency subcarrier modulation. In addition, wide-bandwidth preamplifiers, correct cable types, and low crosstalk are all required to recover subcarrier signals of sufficient quality and amplitude for successful decoding. A different approach to the output matrix, based on Ambisonics theory, is described; it increases the reliability of decoding worn and damaged CD-4 media.
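The sum and difference relationship described above can be sketched as a simple matrix step for one side of the record. This is only the conventional sum/difference arithmetic, not the Ambisonics-based output matrix of the report, and it leaves out FM demodulation of the subcarrier entirely. It assumes the demodulated difference signal is front minus back at unity gain, and the function names are hypothetical.

```python
import numpy as np


def cd4_matrix_encode(front, back):
    """Form the baseband sum and the (pre-FM) difference for one side."""
    front = np.asarray(front, dtype=float)
    back = np.asarray(back, dtype=float)
    return front + back, front - back


def cd4_matrix_decode(sum_sig, diff_sig):
    """Recover front and back from the sum and the demodulated difference.

    Assumes diff_sig = front - back with the same gain as sum_sig.
    """
    sum_sig = np.asarray(sum_sig, dtype=float)
    diff_sig = np.asarray(diff_sig, dtype=float)
    front = 0.5 * (sum_sig + diff_sig)
    back = 0.5 * (sum_sig - diff_sig)
    return front, back
```

The same two functions apply to the right side with RF and RB. If the difference signal is missing or unusable, the sum alone still carries a compatible stereo channel, which is why the baseband is left as the plain front-plus-back mix.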

A binaural technique (involving direct control of the signals delivered to both ears of the listener) can not only solve the problem of spatial impression in headphone reproduction but can also provide realistic auditory experiences, especially in 3D spatial acoustic reproduction. In this study, monophonic source signals were processed by frequency-band decomposition and distribution to achieve spatially widened perceived source widths in binaural synthesis. Stimuli with different widths were synthesized, and the perceived widths were evaluated in a listening experiment to investigate the relationship between the perceived width and the synthesized width. Three different bandwidths of the frequency bands and two center positions of the synthesized widths were used in the processing, and their effects on the perception of source width were investigated. The results of the listening experiment suggested that, under proper processing conditions, the perceived width could increase with increasing synthesized width. However, dependencies on source signal characteristics and variations between participants were observed. Degradations of timbre and spatial quality were also evaluated. The results suggested that this method suffered less degradation than a conventional decorrelation method while achieving comparable widening effects for binaural reproduction. For example, for a cello source signal with 1/12-octave bandwidth, the perceived width increased with increasing synthesized width. This suggests that, under appropriate conditions, this method could control the perceived width of a monophonic source in binaural synthesis.

A Cross-Evaluated Database of Measured and Simulated HRTFs Including 3D Head Meshes, Anthropometric Features, and Headphone Impulse Responses

Authors: Brinkmann, Fabian; Dinakaran, Manoj; Pelzer, Robert; Grosche, Peter; Voss, Daniel; Weinzierl, Stefan

The individualization of head-related transfer functions (HRTFs) can make an important contribution to improving the quality of binaural technology applications. One approach to individualization is to exploit the relationship between the shape of HRTFs and the anthropometric features of the ears, head, and torso of the corresponding listeners. To identify statistically significant relationships between the two sets of variables, a relatively large database is required. For this purpose, full-spherical HRTFs of 96 subjects were acoustically measured and numerically simulated. A detailed cross-evaluation showed good agreement with previous data, between repeated measurements, and between measured and simulated data. In addition to the 96 HRTFs, the database includes high-resolution head meshes, a list of 25 anthropometric features per subject, and headphone transfer functions for two headphone models.

Standards and Information Documents

AES Standards Committee News


When working on “with-height” spatial audio, the pressure to abandon specific channel-based production formats will grow, given that the number of possible reproduction systems is increasingly large. The responsibility then lies increasingly on the playback rendering system to do a good job of delivering a convincing impression of the original intention on whatever reproduction system it is presented with. Preserving or delivering authentic or plausible spatial characteristics of both the direct and diffuse elements of a scene becomes the target.

JAES Special Issue on Semantic Music Production, Call for Papers

2020 AES Academy, Anaheim, Call for Contributions

148th Convention, Vienna, Call for Contributions

Audio Education Conference, Murfreesboro and Nashville, Call for Contributions

Audio for Virtual and Augmented Reality Conference, Redmond, Call for Contributions

Audio Forensics Conference Report, Porto


AES Conventions and Conferences


Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
