Journal of the Audio Engineering Society

The Journal of the Audio Engineering Society — the official publication of the AES — is the only peer-reviewed journal devoted exclusively to audio technology. Published 10 times each year, it is available to all AES members and subscribers.


The Journal contains state-of-the-art technical papers and engineering reports; feature articles covering timely topics; pre- and post-event reports on AES conventions and other society activities; news from AES sections around the world; Standards and Education Committee work; membership news; new products; and newsworthy developments in the field of audio.



2024 June - Volume 72 Number 6


Non-Exponential Reverberation Modeling Using Dark Velvet Noise

Authors: Fagerström, Jon; Schlecht, Sebastian J.; Välimäki, Vesa

Previous research on late-reverberation modeling has mainly focused on exponentially decaying room impulse responses, whereas methods for accurately modeling non-exponential reverberation remain challenging. This paper extends the previously proposed basic dark-velvet-noise reverberation algorithm and proposes a parametrization scheme for modeling late reverberation with arbitrary temporal energy decay. Each pulse in the velvet-noise sequence is routed to a single dictionary filter that is selected from a set of filters based on weighted probabilities. The probabilities control the spectral evolution of the late-reverberation model and are optimized to fit a target impulse response via non-negative least-squares optimization. In this way, the frequency-dependent energy decay of a target late-reverberation impulse response can be fitted with mean and maximum reverberation-time errors of 4% and 8%, respectively, requiring about 50% fewer coloration filters than a previously proposed filtered-velvet-noise algorithm. Furthermore, the extended dark-velvet-noise reverberation algorithm allows the modeled impulse response to be gated, the frequency-dependent reverberation time to be modified, and the model’s spectral evolution and broadband decay to be decoupled. The proposed method is suitable for the parametric late-reverberation synthesis of various acoustic environments, especially spaces that exhibit a non-exponential energy decay, motivating its use in musical audio and virtual reality.

Efficient Velvet-Noise Convolution in Multicore Processors

Authors: Belloch, Jose Antonio; Badia, Jose M.; Leon, German; Välimäki, Vesa


Velvet noise, a sparse pseudo-random signal, finds valuable applications in audio engineering, such as artificial reverberation, decorrelation filtering, and sound synthesis. These applications rely on convolution operations whose computational requirements depend on the length, sparsity, and bit resolution of the velvet-noise sequence used as filter coefficients. Given the inherent sparsity of velvet noise and its occasional restriction to a few distinct values, significant computational savings can be achieved by designing convolution algorithms that exploit these unique properties. This paper shows that an algorithm called the transposed double-vector filter is the most efficient way of convolving velvet noise with an audio signal. This method optimizes access patterns to take advantage of the processor’s fast caches. The sequential sparse algorithm is shown to be always faster than the dense one, and the speedup is linearly dependent on sparsity. The paper also explores the potential for further speedup on multicore platforms through parallelism and evaluates the impact of data encoding, including 16-bit and 32-bit integers and 32-bit floating-point representations. The results show that using the fastest implementation of a long velvet-noise filter, it is possible to process more than 40 channels of audio in real time using the quad-core processor of a modern system-on-chip.
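
As a rough illustration of why sparsity helps, the following minimal sketch generates a velvet-noise sequence with one +1 or -1 pulse per grid period and convolves it with a signal using only the nonzero taps. It is a generic sparse convolution, not the transposed double-vector filter evaluated in the paper. The function names, the `grid_size` parameter, and the seeding are assumptions made for this example.

```python
import random


def velvet_noise(length, grid_size, seed=0):
    """Return (positions, signs) describing a velvet-noise sequence.

    Hypothetical helper: one pulse of value +1 or -1 is placed at a
    random position inside each grid period of `grid_size` samples.
    """
    rng = random.Random(seed)
    positions, signs = [], []
    for start in range(0, length, grid_size):
        end = min(start + grid_size, length)
        positions.append(rng.randrange(start, end))
        signs.append(rng.choice((-1, 1)))
    return positions, signs


def sparse_convolve(x, positions, signs, length):
    """Convolve x with the sparse filter, visiting only nonzero taps.

    Because every tap is +1 or -1, each tap contributes a shifted copy
    of x that is either added or subtracted.
    """
    y = [0.0] * (len(x) + length - 1)
    for k, s in zip(positions, signs):
        for n, sample in enumerate(x):
            y[n + k] += s * sample
    return y


def dense_convolve(x, h):
    """Reference direct-form convolution over every filter coefficient."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, coeff in enumerate(h):
        for n, sample in enumerate(x):
            y[n + i] += coeff * sample
    return y
```

With a grid of `grid_size` samples, the sparse loop visits roughly `1 / grid_size` of the taps that the dense loop does, which is consistent with the abstract's remark that the speedup depends linearly on sparsity.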

A Curvilinear Transfer Function for Wide Dynamic Range Compression with Expansion

Authors: Sokolova, Alice; Aksanli, Baris; Harris, Fredric; Garudadri, Harinath

Wide Dynamic Range Compression in hearing aids is becoming increasingly complex as the number of channels and adjustable parameters grows. At the same time, there is growing demand for customization and user self-adjustment of hearing aids, necessitating a balance between complexity and user accessibility. Compression in hearing aids is governed by the input-output transfer function, which relates input magnitude to output magnitude, and is typically defined as a combination of piecewise-linear segments resembling logarithmic behavior. This work presents an alternative to the conventional compression transfer function that consolidates multiple compression parameters and revisits expansion in hearing aids. The curvilinear transfer function is a continuous curve with logarithm-like behavior, governed by two parameters: gain and compression ratio. Experimental results show that curvilinear compression reduces the amplification of low-level noise, improves signal-to-noise ratio by up to 1.0 dB, improves sound quality as measured by the Hearing Aids Speech Quality Index by up to 6.7%, and provides comparable intelligibility as measured by the Hearing Aids Speech Perception Index, with simplified parameterization compared to conventional compression. The consolidated curvilinear transfer function is highly applicable to over-the-counter hearing aids and offers more capabilities for customization than current prominent over-the-counter and self-adjusted hearing aids.

Investigating Individual, Loudness-Dependent Equalization Preferences in Different Driving Sound Conditions

Authors: Rennies, Jan; Buchholz, Sina; Volgenandt, Andreas; Bruns, Tobias; Rollwage, Christian; Appell, Jens-E.

In automotive audio playback systems, dynamically increasing driving sounds are typically taken into account by applying a generic, i.e., non-individualized, increase in overall level and low-frequency amplification to compensate for increased masking. This study investigated the degree of individuality regarding the preferences of noise-dependent level and equalizer settings. A user study with 18 normal-hearing participants was conducted in which individually preferred level-dependent and frequency-dependent amplification parameters were determined using a music-based procedure in quiet and in nine different driving noise conditions. The comparison of self-adjusted parameters suggested that, on average, participants adjusted higher overall levels and more low-frequency amplification in noise than in quiet. However, preferred self-adjusted levels differed markedly between participants for the same listening conditions but were quite similar in a re-test session for each participant, indicating that individual preferences were stable and could be reproducibly measured with the employed personalization scheme. Furthermore, the impact of driving noise on individually preferred settings revealed strong interindividual differences, indicating that listeners can differ widely with respect to their individual optimum of how equalizer and level settings should be dynamically adapted to changes in driving conditions.

Analysis and Model of Temporal Sound Attributes from Recorded Audio

Authors: Moiragias, George; Mourjopoulos, John N.

A computational framework is proposed for analyzing the temporal evolution of perceptual attributes of sound stimuli. As a paradigm, the perceptual attribute of envelopment, which is manifested in different audio sound reproduction formats, is employed. For this, listener temporal ratings of the envelopment for mono, stereo, and 5.0-channel surround music samples serve as the ground truth for establishing a computational model that can accurately trace temporal changes from such recordings. Combining established and heuristic methodologies, different features of the audio signals, named long-term (LT) features, were extracted at each segment in which envelopment ratings were registered. A memory LT computational stage is proposed to account for the temporal variations of the features through the duration of the signal, based on the exponentially weighted moving average of the respective LT features. These are utilized in a gradient tree boosting machine learning algorithm, leading to a Dynamic Model that accurately predicts the listener’s temporal envelopment ratings. Without the proposed memory LT feature function, a Static Model is also derived, which is shown to have lower performance for predicting such temporal envelopment variations.
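
For readers unfamiliar with the exponentially weighted moving average mentioned above, a minimal sketch of the standard recursion follows. The smoothing factor `alpha`, the initialization with the first value, and the function name are assumptions for illustration. The abstract does not state the settings the authors used.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average of a feature sequence.

    Standard recursion: m[t] = alpha * x[t] + (1 - alpha) * m[t - 1],
    initialized with the first value. `alpha` is a hypothetical
    smoothing factor in (0, 1]; larger values forget the past faster.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    state = None
    for value in values:
        if state is None:
            state = value  # the first segment has no history to blend with
        else:
            state = alpha * value + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed
```

Applied to a per-segment feature, each output blends the current segment with a decaying summary of all earlier segments, which is one simple way to give a predictor access to the signal's history.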

Standards and Information Documents

AES Standards Committee News


Call for Papers: Special Issue on The Sound of Digital Audio Effects



AES Officers, Committees, Offices & Journal Staff

Cover & Sustaining Members List

Table of Contents
