Journal of the Audio Engineering Society

The Journal of the Audio Engineering Society — the official publication of the AES — is the only peer-reviewed journal devoted exclusively to audio technology. Published 10 times each year, it is available to all AES members and subscribers.

 

The Journal contains state-of-the-art technical papers and engineering reports; feature articles covering timely topics; reports before and after AES conventions and other society activities; news from AES sections around the world; Standards and Education Committee work; membership news; new products; and newsworthy developments in the field of audio.

2024 April - Volume 72 Number 4

Papers


Spatial Matrix Synthesis

Authors: Schmele, Timothy; Garriga, Adan

Spatial Matrix synthesis is presented in this paper. This modulation synthesis technique creates acoustic velocity fields from acoustic pressure signals by using spatial transformation matrices, thus generating complete sound fields for spatial audio. The analysis presented here focuses on orthogonal rotation matrices in both two and three dimensions and compares the results in each scenario with other sound modulation synthesis methods, including amplitude and frequency modulation. As an alternative method for spatial sound synthesis that exclusively modifies the acoustic velocity vector through effects comparable to those created by both amplitude and frequency modulations, Spatial Matrix synthesis is argued to generate inherently spatial sounds, giving this method the potential to become a new musical instrument for spatial music.
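
As a rough illustration of the idea of applying a rotation matrix to a pressure signal, the sketch below places each pressure sample on the x axis and rotates it by a time-varying angle to obtain a two-component velocity-like vector. This is only a minimal sketch under stated assumptions, not the authors' formulation: the function names, the choice of (p, 0) as the starting vector, and the constant rotation rate `rotation_hz` are all hypothetical. Under these assumptions the two components are p·cos(θ) and p·sin(θ), so each component is the pressure signal multiplied by a sinusoid, which is one way to see the link to amplitude modulation mentioned in the abstract.

```python
import math


def rotate_2d(vec, angle):
    """Apply the 2D orthogonal rotation matrix for `angle` (radians) to `vec`."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def velocity_from_pressure(pressure, sample_rate, rotation_hz):
    """Map pressure samples to 2D vectors by rotating (p, 0) over time.

    Assumption for illustration only: the rotation angle grows linearly,
    completing `rotation_hz` full turns per second.
    """
    vectors = []
    for n, p in enumerate(pressure):
        angle = 2.0 * math.pi * rotation_hz * n / sample_rate
        vectors.append(rotate_2d((p, 0.0), angle))
    return vectors
```

Because the matrix is orthogonal, the length of each output vector equals the magnitude of the corresponding pressure sample; only its direction changes over time.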

The Shepard-tone sequence and Shepard–Risset glissando are classic auditory illusions in which pitch seems to inexhaustibly ascend or descend. Such stimuli have been used in scientific research, as well as for artistic purposes. This paper demonstrates several variations of those illusions, some of which do not appear to have been previously discussed in the literature. Most notably, hybrids of the two illusions are demonstrated, in which discrete Shepard-tone steps are connected by continuous glissandi. It is shown, using a sample of 91 listeners, that such hybrids can disambiguate the perceived direction of motion between two Shepard tones that are a tritone apart, thus overriding what has been called the tritone paradox. In other demonstrations, multiple layers of monaural and binaural beats are embedded into a Shepard–Risset glissando to produce Risset rhythms. Audio files for these and other examples are provided and discussed. Two original MATLAB functions (and equivalent functions in R) are also provided, which can be used to replicate the examples and explore additional variations.
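
For readers unfamiliar with the construction, a Shepard tone is commonly built from sinusoids spaced one octave apart and weighted by a fixed bell-shaped envelope over log-frequency, so that partials fade in at the bottom and fade out at the top. The following is a minimal sketch of that common construction and is not the MATLAB or R code that accompanies the paper. The function names, the raised-cosine envelope, and the default values for `base_freq`, `n_octaves`, and `sample_rate` are assumptions chosen for illustration.

```python
import math


def shepard_partials(pitch_class, base_freq=27.5, n_octaves=8):
    """Return (frequency, amplitude) pairs for one Shepard tone.

    `pitch_class` is in semitones above `base_freq`. The raised-cosine
    envelope is zero at both ends of the `n_octaves` range.
    """
    partials = []
    for k in range(n_octaves):
        octave_pos = k + pitch_class / 12.0           # position in octaves
        freq = base_freq * 2.0 ** octave_pos
        amp = 0.5 * (1.0 - math.cos(2.0 * math.pi * octave_pos / n_octaves))
        partials.append((freq, amp))
    return partials


def shepard_tone(pitch_class, sample_rate=22050, duration=0.25):
    """Render one Shepard tone as a list of samples in [-1, 1]."""
    partials = shepard_partials(pitch_class)
    total_amp = sum(amp for _, amp in partials)       # normalise peak level
    n_samples = int(sample_rate * duration)
    return [
        sum(amp * math.sin(2.0 * math.pi * freq * n / sample_rate)
            for freq, amp in partials) / total_amp
        for n in range(n_samples)
    ]


# An ascending sequence: twelve semitone steps, after which the cycle repeats.
sequence = [shepard_tone(step, duration=0.05) for step in range(12)]
```

With this envelope, the audible partials at twelve semitones coincide with those at zero semitones, which is why the sequence can loop without an audible return to the starting pitch.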

Perceptual Comparison of 3D Audio Reproduction With and Without Bottom Channels

Authors: Howie, Will; Martin, Denis; Marui, Atsushi; Kamekawa, Toru; Kim, Sungyoung; Aydin, Aybar; King, Richard

This study examines the perceptual effects of bottom channels, i.e., floor-level loudspeakers, within 3D audio reproduction. Two listening tests were undertaken at three different venues, using experienced subjects. Both experiments involved comparing three different versions of seven different musical and nonmusical sound scenes: the original mix with all three vertical loudspeaker layers active (Full), the bottom layer muted (Cut), and the bottom layer downmixed into the main layer loudspeakers (X). Results indicate that listeners could discriminate between the three reproduction conditions with a very high degree of accuracy, particularly when comparing the "Full vs. Cut" and "Full vs. X" conditions. Subjects found that the most salient aspects of the sound scene in terms of differentiating between reproduction conditions were related to low-frequency energy, changes in horizontal and vertical imaging, and timbre/tone. Discrimination ability between reproduction conditions was consistent across all three listener groups, though subjects' perception of the degree of difference between reproduction conditions across various auditory attributes varied between groups. These differences may be related to subjects' previous experience with 3D audio including bottom channels, venue bottom-layer loudspeaker angles of elevation, and venue acoustic conditions.

Feedforward Headphone Active Noise Control Utilizing Auditory Masking

Authors: Zachos, Panagiotis; Kamaris, Gavriil; Mourjopoulos, John

A novel adaptive Active Noise Control approach for headphones is presented in this work. The proposed method achieves a reduction in the perceived noise-disturbance and thus makes the presented stimuli more acceptable to the listeners. The ANC operates by extracting auditory masking thresholds based on the signal the user intends to listen to, i.e., music, and subsequently designing a time-domain parallel infinite impulse response filter bank by employing a novel system identification strategy, making the proposed method ideal for SoC implementation. A subjective listening test and an objective metric are utilized to assess the perceived improvement of the method and a statistical analysis is performed on the results in order to show the advantage of the proposed ANC method compared to traditional approaches in terms of perceived increase in audio quality.

Car Interior Sound Field Zoning Using Optimal Loudspeaker Array and Double Iteration Method

Authors: Ma, Conggan; An, Yuansheng; Shen, Ende; Yu, Donglei; Zhang, Jiayue

Car interior sound field zoning is receiving increasing attention. Weighting factors are generally used to balance the acoustic contrast and the sound field reconstruction error of the sound field zoning control. However, the weighting factors often need to be set separately at different frequencies, which complicates operation, and the optimal control effect cannot be guaranteed over the whole frequency range of interest in a car. To solve this problem, a car interior sound field zoning method with optimal weighting factors is proposed in this paper. Firstly, the mathematical model of the method combining acoustic contrast control and pressure matching under the constraint of the multi-loudspeaker driving signal is established. Then, the optimal loudspeaker array is obtained by using the genetic algorithm. After that, the optimal weighting factor of each frequency is obtained by using the double iteration procedure. Finally, to verify the proposed method, a sound field zoning control system was built in a car, and the experimental results show that the broadband average acoustic contrast between the bright zone and dark zone in the car is greater than 20 dB, and the average sound field reconstruction error in the bright zone is less than -7 dB.
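
To make the two reported figures of merit concrete, the sketch below computes them from sampled sound pressures using definitions that are common in the sound zoning literature: acoustic contrast as the ratio of mean squared pressure in the bright zone to that in the dark zone, and reconstruction error as the squared deviation from a target field normalized by the target energy. Whether the paper uses exactly these definitions is an assumption, and the function names and sample values are hypothetical.

```python
import math


def mean_square(pressures):
    """Mean squared magnitude of real or complex pressure samples."""
    return sum(abs(p) ** 2 for p in pressures) / len(pressures)


def acoustic_contrast_db(bright, dark):
    """Ratio of bright-zone to dark-zone mean squared pressure, in dB."""
    return 10.0 * math.log10(mean_square(bright) / mean_square(dark))


def reconstruction_error_db(reproduced, target):
    """Squared error against the target field, normalized by target energy, in dB."""
    error = sum(abs(r - t) ** 2 for r, t in zip(reproduced, target))
    energy = sum(abs(t) ** 2 for t in target)
    return 10.0 * math.log10(error / energy)


# Hypothetical microphone readings at one frequency (arbitrary units).
bright_zone = [1.0, 1.0, 1.0, 1.0]
dark_zone = [0.1, 0.1, 0.1, 0.1]
contrast = acoustic_contrast_db(bright_zone, dark_zone)       # about 20 dB
error = reconstruction_error_db([1.1, 0.9], [1.0, 1.0])       # about -20 dB
```

A contrast of 20 dB corresponds to the dark-zone pressure amplitude being one tenth of the bright-zone amplitude, and a more negative reconstruction error means the reproduced field is closer to the target.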

Engineering Reports


A System for Sonic Explorations With Evolutionary Algorithms

Authors: Jónsson, Björn Þór; Erdem, Çagri; Glette, Kyrre

The discovery of new sounds can inspire creativity, and various approaches have been explored in that effort. One approach is the application of evolutionary algorithms to sound synthesis. The work presented here is related to those efforts and focuses on the properties of pattern-producing networks and neuroevolution. The authors chronicle a journey of concept development and implementation iterations. A discussion of a browser-based environment facilitating such explorations is followed by an account of a migration to a more flexible execution technology stack, using the Web Audio API. As a result of that process, the authors introduce a system consisting of a software library and a command-line utility that facilitates further investigations. Those products are intended to serve as tools for further explorations of the applicability of evolutionary algorithms to the synthesis and discovery of inspiring sounds.

Communications


Toward a Standard Listener-Independent HRTF to Facilitate Long-Term Adaptation

Authors: Lladó, Pedro; Pollack, Katharina; Meyer-Kahlen, Nils

Head-related transfer functions (HRTFs) are used in auditory applications for spatializing virtual sound sources. Listener-specific HRTFs, which aim at mimicking the filtering of the head, torso, and pinnae of a specific listener, improve the perceived quality of virtual sound compared to using non-individualized HRTFs. However, using listener-specific HRTFs may not be accessible for everyone. Here, the authors propose as an alternative to take advantage of the adaptation abilities of human listeners to a new set of HRTFs. They claim that agreeing upon a single listener-independent set of HRTFs has beneficial effects for long-term adaptation compared to using several, potentially severely different HRTFs. Thus, the Non-individual Ear MOdel (NEMO) initiative is a first step toward a standardized listener-independent set of HRTFs to be used across applications as an alternative to individualization. A prototype, NEMObeta, is presented to explicitly encourage external feedback from the spatial audio community and to agree on a complete list of requirements for the future HRTF selection.

Standards and Information Documents


AES Standards Committee News


Departments


Extras


AES Officers, Committees, Offices & Journal Staff

Cover & Sustaining Members List

Table of Contents


Institutional Subscribers: If your company or library has an institutional subscription to the E-Library, then click here to access it.
