Journal of the Audio Engineering Society

2012 October - Volume 60 Number 10


Acoustics and Modeling of Pickups

Authors: Paiva, Rafael C. D.; Pakarinen, Jyri; Välimäki, Vesa

Synthesizing steel-stringed instruments such as the guitar requires a model of the pickup. This model includes the pickup position, the sensitivity width of the transducer, mixing options with multiple pickups, linear resonant filtering, and distortion produced by the distance-dependent magnetic flux. A waveguide framework was used to describe the frequency coloration caused by the pickup location and the low-pass effect of the sensitivity width. The resulting models can be used in musical sound synthesis and digital effects. The physical properties of the pickup transducer modify the timbre of the instrument in various ways. Implementing audio effects for real guitar signals would require hexaphonic pickups that provide a separate signal stream for each string. Several commercial implementations of such pickup systems currently exist.
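As a rough illustration of the position and width effects mentioned in this abstract, the sketch below computes per-harmonic gains for an ideal string fixed at both ends. It is not the authors' waveguide model: the function names, the ideal-string mode shape, and the uniform (rectangular) sensitivity window are all assumptions made for the example.

```python
import math

def position_gain(harmonic, rel_pos):
    """Mode-shape weight of a harmonic at a pickup position.

    rel_pos is the pickup distance from the bridge divided by the string
    length (0..1). Assumes an ideal string fixed at both ends.
    """
    return math.sin(harmonic * math.pi * rel_pos)

def width_gain(harmonic, rel_width):
    """Low-pass factor from averaging the mode shape over the pickup width.

    Assumes a uniform sensitivity window of width rel_width (relative to
    the string length); the average yields a sin(a)/a factor.
    """
    a = harmonic * math.pi * rel_width / 2.0
    return 1.0 if a == 0.0 else math.sin(a) / a

def pickup_gain(harmonic, rel_pos, rel_width=0.0):
    """Combined position (comb-like) and width (low-pass) weighting."""
    return position_gain(harmonic, rel_pos) * width_gain(harmonic, rel_width)

if __name__ == "__main__":
    # Pickup at one quarter of the string length, 5% sensitivity width
    # (hypothetical values chosen only for the demonstration).
    for k in range(1, 9):
        print(k, round(pickup_gain(k, 0.25, 0.05), 4))
```

With the pickup at one quarter of the string length, every fourth harmonic falls on a node and is suppressed, which is the comb-like coloration; the width factor attenuates higher harmonics progressively.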

Variability in Perceptual Evaluation of HRTFs

Authors: Schönstein, David; Katz, Brian F.G.

Because appropriate head-related transfer functions (HRTFs) are key to binaural rendering, an evaluation is required to assess processing steps when individual HRTFs are not available. This study involving six subjects showed significant response variability in perceptual evaluations of HRTFs when subjects were asked to judge six sets of HRTFs, including individual HRTFs, on three different attributes. Insufficient reproducibility is problematic when trying to select nonindividual HRTFs. In order to minimize the effect of learning, adequate training should be provided. By using attribute evaluations and assessor selection, this study offers a methodology that might be used to produce consistent evaluations in commercial binaural synthesis.

The expanding use of portable multimedia devices has intensified the need for better forms of scalable spatial audio coding (SAC) that match the connectivity rate and multichannel playback capabilities of the receiving device. A new SAC method is based on the parameterization of multichannel audio, representing it as a linear combination of objects composed of fixed spectral bases with time-varying gain and channel-dependent spatial gain. Spatial parameters can be estimated from the original multichannel signal using psychoacoustic properties of sound source localization. The base audio can be monophonic or downmixed stereophonic. Listening tests showed that the proposed SAC algorithm matched the performance of conventional spatial audio coding methods at similar bit rates. The sound separation performance was also evaluated and found suitable for separating sound sources directly in the coding domain.

All-Round Ambisonic Panning and Decoding

Authors: Zotter, Franz; Frank, Matthias


The ideal panning algorithm for creating virtual locations in surround sound would have small variations in energy as the target locations change. All-Round Ambisonic Panning (AllRAP) is an algorithm that aims to create phantom sources of stable loudness and adjustable width for arbitrary loudspeaker arrangements. This is achieved by combining an extended version of Vector-Base Amplitude Panning (VBAP) with Ambisonics. As an audio format, Ambisonics needs to be decoded to loudspeakers, which conventionally requires either dedicated loudspeaker arrangements or sophisticated mathematical treatment. In contrast, the proposed panning and decoding algorithm is highly generic and easily applicable to any arrangement of loudspeakers and to platforms with only basic computational capacity. Because AllRAP also works with loudspeaker arrangements covering only a part of a sphere, it is suitable for upcoming surround-with-height formats.
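For readers unfamiliar with VBAP, one of the two building blocks named in this abstract, the sketch below shows plain two-dimensional pairwise VBAP. It is not AllRAP and not the extended VBAP the authors describe; the function name and the degree-based angle convention are assumptions made for the example.

```python
import math

def vbap_2d(source_deg, spk1_deg, spk2_deg):
    """Return energy-normalized gains (g1, g2) for one loudspeaker pair.

    Solves g1*l1 + g2*l2 = p, where l1 and l2 are the loudspeaker unit
    vectors and p is the source unit vector, then scales the gains so
    that g1**2 + g2**2 == 1.
    """
    def unit(deg):
        rad = math.radians(deg)
        return math.cos(rad), math.sin(rad)

    p = unit(source_deg)
    l1 = unit(spk1_deg)
    l2 = unit(spk2_deg)

    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")

    # Cramer's rule for the 2x2 system.
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det

    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

if __name__ == "__main__":
    # Source straight ahead, loudspeakers at +30 and -30 degrees.
    print(vbap_2d(0.0, 30.0, -30.0))
```

A source midway between the two loudspeakers receives equal gains, and a source coinciding with one loudspeaker is reproduced by that loudspeaker alone. A negative gain indicates that the source lies outside the arc spanned by the pair.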

On the Improvement of Localization Accuracy with Non-Individualized HRTF-Based Sounds

Authors: Mendonça, Catarina; Campos, Guilherme; Dias, Paulo; Vieira, José; Ferreira, João P.; Santos, Jorge A.

Even though individual head-related transfer function (HRTF) filters produce better performance in virtual-reality environments, measuring individuals is labor intensive and expensive. Can training be used to enhance the performance of generic filters? This research shows that short training sessions with feedback allow for perceptual adaptation where simple exposure to generic HRTF filters did not. The benefits of training were observed not only for the trained sounds but also for other stimulus positions that were not part of the training. Apparently, subjects were actually adapting and generalizing to the generic HRTF filters, which is a manifestation of sensory neural plasticity. Learning profiles are unique to individuals. Any testing of localization performance should recognize the influence of training.

H-Semantics: A Hybrid Approach to Singing Voice Separation

Authors: Sofianos, Stratis; Ariyaeeinia, Aladdin; Polfreman, Richard; Sotudeh, Reza

Separating the singing voice from accompanying instruments is important in music information-retrieval systems, since it allows for such applications as melody extraction, lyrics recognition, and singer identification. The authors investigate an unsupervised method for separating the singing voice, called H-Semantics (Hybrid Singing Extraction through Multiband Amplitude Enhanced Thresholding and Independent Component Subtraction). The proposed method adds time-domain separation to previous work that was based on frequency-domain cepstral methods. The results indicate a separation gain of approximately 8.5 dB in signal-to-distortion ratio over the baseline.

Standards and Information Documents

AES Standards Committee News


[Feature] Analysis of the electric network frequency (ENF) has rapidly emerged as a crucial tool in the armory of the forensic audio analyst. Traces of the ENF are often picked up on recordings, either by electromagnetic induction or acoustically, and these can be detected subsequently by analysts. It turns out that unique patterns in the frequency signature of power-line-related signals can be used to identify the time and place in which audio recordings might have been made. This is possible because some power grid companies keep records of the ENF in a database, against which forensic audio samples can be compared. Even if no such database is available, which is sometimes the case, analysis of residual traces of the ENF in a recording can be used to detect audio editing and other processes that might have been used to modify it. During the 46th International Conference, held recently in Denver, Colorado, a substantial number of the papers and posters were devoted to aspects of ENF analysis, a selection of which are summarized here.

47th Conference Report, Chicago

Review of Sustaining Members

134th Call for Papers and Engineering Briefs, Rome

52nd Call for Contributions, Guildford


Advertiser Internet Directory

Products and Developments

AES Conventions and Conferences


Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
