Journal of the Audio Engineering Society

2015 June - Volume 63 Number 6


Intelligent Multitrack Dynamic Range Compression

Authors: Ma, Zheng; De Man, Brecht; Pestana, Pedro D. L.; Black, Dawn A. A.; Reiss, Joshua D.

An intelligent dynamic range compression (DRC) algorithm, using the CA-DAFX processing architecture, produces the optimal amount of dynamic range for multitrack recordings. The algorithm exploits the interdependence of input audio features, incorporates best practices, and uses subjective evaluation. The classical parameters of a typical compressor (ratio, threshold, knee, attack, and release) are dynamically adjusted depending on extracted features and control rules. Two new audio weighting features, percussiveness and low-frequency strength, were proposed to incorporate the transient nature and spectral content of the signal. The authors applied multiple linear regression models to the subjective results to formulate the ratio and threshold automations that follow the choices of the human operators. The results showed that the algorithm can compete with or outperform semiprofessional mixes in terms of four different perceptual criteria: the appropriateness of the amount of DRC applied, the degree of imperfection, the ability to stabilize the erratic level fluctuations, and overall preference.
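For readers unfamiliar with the compressor parameters named above, the following is a minimal sketch of the textbook static gain curve with a soft knee. It is not the authors' algorithm: the function name and the parameter values are illustrative assumptions, and attack and release (the time-domain smoothing) are not modeled.

```python
def compressor_output_db(level_db, threshold_db, ratio, knee_db):
    """Static compressor curve: map an input level (dB) to an output level (dB).

    Hypothetical helper for illustration only. Uses the common
    quadratic soft-knee interpolation around the threshold.
    """
    over = level_db - threshold_db  # how far the input is above the threshold
    # Inside the knee region: blend smoothly between unity gain and the ratio.
    if knee_db > 0 and 2 * abs(over) <= knee_db:
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    # Above the knee: every `ratio` dB of input yields 1 dB of output.
    if over > 0:
        return threshold_db + over / ratio
    # Below the knee: the signal passes unchanged.
    return level_db


# Illustrative settings (assumed, not taken from the paper).
hard = compressor_output_db(-10.0, threshold_db=-20.0, ratio=4.0, knee_db=0.0)
soft = compressor_output_db(-20.0, threshold_db=-20.0, ratio=4.0, knee_db=6.0)
```

With a 4:1 ratio and a threshold of -20 dB, an input at -10 dB is 10 dB over the threshold and comes out 2.5 dB over it. An automatic system such as the one described would choose `ratio` and `threshold_db` per track from extracted features rather than fix them by hand.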

Some researchers believe that the reproduction of multichannel spatial audio would be improved by using separate transducers for the direct and diffuse components of the sound. This research seeks to empirically test this assumption. The perceptual effect of angular separation in commonly used 5.0 and 7.0 multichannel systems was investigated. Four listening experiments were performed involving several schemes of separation and a variety of experimental conditions. The listeners consistently preferred schemes with separation. The perceptual effects of four types of management of direct and reflected parts of spatial impulse responses (SIR) were considered: using complete SIRs in all channels; removing the direct sound (DS) from all but the center channel in a standard 5.0 system; removing ambience from the center channel in a standard 5.0 system; and applying both of the above operations (separating DS from reflected sounds (RS) in particular channels). Depending on the configuration, separation did in fact have advantages.

A Modified Additive Synthesis Method Using Source-Filter Model

Authors: Korvel, Grazina; Šimonyte, Virginija; Slivinskas, Vytautas

A modified additive synthesis method using a combination of additive synthesis and source filter modeling has been proposed for musical instruments. When applied to a trumpet, the results are perceived as being less synthetic than additive synthesis by itself. Since harmonics as well as inharmonics are important for resynthesized sound quality, the authors segment the frequency range into bands each of which contains a single harmonic with inharmonics. A sinusoidal model is not sufficient to represent both harmonics and inharmonics. A more complex model of a quasipolynomial in each of the frequency bands is proposed. The quasipolynomial can be obtained as the response of a special linear filter to the unit impulse. Thus each harmonic is obtained using a source-filter model. The inputs of this filter are periodic pulses with time-varying amplitudes and slightly varying periods. Two listening tests were performed to evaluate the perceived quality.
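As a point of comparison, plain additive synthesis (the baseline the abstract contrasts against) can be sketched as a sum of fixed-amplitude harmonic sinusoids. This is not the modified method: the quasipolynomial band model and the pulse-driven filter are not reproduced here, and the function name, amplitudes, and sample rate are assumptions chosen for illustration.

```python
import math


def additive_synth(f0_hz, harmonic_amps, duration_s, sample_rate=44100):
    """Plain additive synthesis: sum harmonics k*f0 with fixed amplitudes.

    Hypothetical baseline for illustration; harmonic_amps[0] is the
    amplitude of the fundamental, harmonic_amps[1] of the 2nd harmonic, etc.
    """
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of this sample in seconds
        samples.append(
            sum(
                amp * math.sin(2 * math.pi * f0_hz * (k + 1) * t)
                for k, amp in enumerate(harmonic_amps)
            )
        )
    return samples


# 10 ms of a 440 Hz tone with two harmonics (illustrative values).
tone = additive_synth(440.0, [1.0, 0.5], duration_s=0.01, sample_rate=8000)
```

Because each partial here is a pure sinusoid at an exact multiple of the fundamental, the sketch cannot represent the inharmonic content that the abstract identifies as important to perceived quality.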

Detecting Replicas within Audio Evidence Using an Adaptive Audio Fingerprinting Scheme

Authors: Távora, Rodrigo G. F.; Nascimento, Francisco Assis

Audio authenticity is one of the major tasks for audio forensic experts because it is often a requirement for the admissibility of digital audio evidence. This investigation proposes a passive authentication method based on an adaptive audio fingerprinting scheme to detect forgeries produced by the replication of an audio interval within the same evidence. Several audio fingerprinting systems are analyzed, and an adaptive scheme based on the Fourier spectrum distribution is proposed. The adaptive system is theoretically and empirically adjusted to detect short replicas. Simulations are performed to analyze the robustness against time, frequency domain, and compression distortions. The method has the power to discriminate repeated text speech and distinguish it from audio replicas as short as 0.1 s even in the presence of amplitude and frequency distortions.

Engineering Reports

When the motor of a classical electrodynamic loudspeaker is replaced by a so-called magnet-only motor, current distortion caused by magnetic circuits is significantly reduced. All of the nonlinear soft-iron pieces are removed, and the magnetic field is created by a combination of permanent magnets. Two loudspeakers are compared in the study. The first sample is a mass-produced 6.5-inch loudspeaker. The second sample is built by replacing the magnetic circuit of the first loudspeaker with the magnet-only structure. Thus, the moving parts, including the voice-coil, diaphragm, and suspensions, are kept identical for both samples, leading to a comparison based only on their motors. The measurements on the original loudspeaker have shown a significant variation of apparent resistance and inductance of the voice-coil with current, displacement, and frequency. Conversely, the measurements on the magnet-only loudspeaker have shown almost no variation whatever the frequency, current level, or position of the voice-coil.

Occupancy-Based Analysis and Interpretation of Soundscape Auditory Complexity: Case of a Campus Restaurant

Authors: Ghozi, Raja; Fraj, Olfa; Salem, Mohsen Bel-haj; Jaidane, Meriem

Audio scene analysis and soundscape perception have been of interest to many researchers in room acoustics, urban sound design, and audio processing. This article presents an analysis and interpretation of soundscape perceptions of a public confined eating space. Based on a realistic setting of a campus restaurant, a 220-subject survey was conducted where various auditory complexity perceptions were queried as the space went from empty to full occupancy. The responses to five aspects of auditory complexity perception are investigated via an on-site survey: the level of auditory attention, soundscape complexity, sound nuisance due to objects, that due to human voices, and the ease of carrying on a discussion in the restaurant. The audio recordings of the changing restaurant soundscape were examined for objective signal complexity changes, via first- and second-order entropy dynamics, in a search for tendencies that align with the subjective study. Nonmonotonic changes of these perceptions were noted with increasing occupancy levels. Various physical, spatial, and human factors are in constant interaction to form a composite experience, and therefore a complex soundscape, and all of them play a contributing role.
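The abstract does not define its entropy measures, so the following is only a sketch under stated assumptions: "first-order" is taken to mean the Shannon entropy of the distribution of quantized sample values, and "second-order" the Shannon entropy of adjacent value pairs. The function names and the toy sequence are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def pair_entropy(symbols):
    """Entropy in bits of adjacent symbol pairs (assumed 'second-order' measure)."""
    return shannon_entropy(list(zip(symbols, symbols[1:])))


# Toy quantized signal (assumed data, not from the study).
quantized = [0, 1, 0, 1]
h1 = shannon_entropy(quantized)  # two equally likely symbols
h2 = pair_entropy(quantized)     # pairs: (0, 1), (1, 0), (0, 1)
```

Tracking such values over successive windows of a recording gives one possible notion of "entropy dynamics" as occupancy changes; the study's actual procedure may differ.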

Standards and Information Documents

AES Standards Committee News


Studio Acoustical Design

Authors: Rumsey, Francis

[Feature] Workshops at the 137th Convention looked into both the history of control room acoustic fashions and the challenges of “finding a good acoustic space” faced by engineers starting out in business today.

61st Conference, Call for Papers


Products and Developments

AES Conventions and Conferences


Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
