Journal of the Audio Engineering Society

2022 January/February - Volume 70 Number 1/2


A Speech Enhancement Method Based on the Combination of Microphone Array and Parabolic Reflector

Authors: Geng, Yanzhang; Zhang, Tao; Yaw, Mensah Samuel; Wang, Heng

Speech enhancement is an essential part of speech processing research. In most cases, the performance of back-end speech technology (such as speech recognition) depends on the quality of the speech enhancement output. One typical multi-channel speech enhancement method is microphone array beamforming. However, beamforming performance degrades in low Signal-to-Noise Ratio (SNR) environments. We propose a speech enhancement method called Paraboloid with Microphone Array (PMA) to improve microphone array performance. The PMA is a combined enhancement method: it pairs a beamforming speech enhancement module with an acoustic enhancement module realized by a paraboloid. The method can be described as follows: (1) The target signal is enhanced by the microphone array and by acoustic focusing, which we achieve by attaching a circular microphone array to a paraboloid. These two methods enhance the signal from different perspectives, making the enhanced signals complementary. (2) We employ Independent Component Analysis (ICA) to combine the outputs of the two methods, which provides a secondary enhancement of the speech signal. In this study, we quantitatively analyze the speech enhancement properties of the paraboloid. Computer simulation shows that the proposed method performs well in low-SNR environments. The real-world experimental results are similar to the computer simulation results. In addition, subjective experiments verify the feasibility of the proposed method.


It is desirable that a measured acoustic impulse response has constant normalized noise power (NNP) in all frequency bands. However, the conventional measurement signals aimed at achieving this property were derived intuitively, and their theoretical background is insufficient. In this work, we first theoretically derived the relational formula that the measurement signals must satisfy for the measured impulse response to have constant NNP over all frequency bands. This formula covers all the measurement signals that achieve constant NNP. We then found the shortest (equivalently, the minimum-energy) measurement signal among them. We call this signal the bandwise minimum noise (BMN) signal. Experiments to measure room impulse responses were carried out. The experimental results confirmed that the impulse responses measured with the BMN signal had almost constant NNP in all frequency bands. It was also confirmed that the BMN signal achieved the NNP required for reverberation time measurement with a shorter signal length than the conventional measurement signals.

Non-ideal conditions in actual transaural reproduction, such as a slightly off-central listening position and reflections in the listening room, can impair the perfect reconstruction of binaural pressures and give rise to perceived timbre coloration. In the present work, a high-frequency band equalization method is proposed to further reduce the timbre coloration in transaural reproduction with two frontal loudspeakers. The high-frequency responses of a pair of transaural filters are equalized by a frequency-dependent factor so that the overall power spectra of the responses remain constant, while the low-frequency responses of the transaural filters are kept intact. An analysis using Moore's revised loudness model indicates that the proposed method reduces the deviation between the binaural loudness level spectra in transaural reproduction and those of the target sound source. A further psychoacoustic experiment validates that the proposed method reduces the timbre coloration in transaural reproduction without introducing an obvious perceivable directional distortion for virtual sources in the frontal quadrants of the horizontal plane.

Laser-Sound Transduction From Digital ΣΔ Streams

Authors: Kaleris, Konstantinos; Stelzner, Bjoern; Hatziantoniou, Panagiotis; Trimis, Dimosthenis; Mourjopoulos, John

In this work, a novel optoacoustic transducer prototype capable of reproducing continuous sound waves from single or multi-bit ΣΔ digital audio streams is presented. The prototype is based on a pulsed nanosecond laser that generates acoustic N-waves via Laser-Induced Breakdown in air or indirect sound via the laser ablation effect on solid targets. Technical aspects of the prototype platform are presented, with emphasis on the laser control system and audio signal-processing techniques deployed for the modulation of the pulsed laser radiation. Experimental results from acoustic measurements of reproduced test audio signals are presented within the frequency range allowed by the specifications of the specific laser. The system’s audio performance characteristics are derived along with its impulse and frequency responses. The experimental results are compared with simulations of the optoacoustic transducer’s response via a computational model, showing good agreement.
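
As background on the kind of input stream the abstract refers to, the following is a minimal sketch of a textbook first-order, single-bit ΣΔ modulator. It is not the authors' signal chain: the function name `sigma_delta_1bit` and the choice of a first-order loop are assumptions made for illustration only. It shows how a stream of +1/-1 values can carry a waveform in its local average, which is what makes such a stream a candidate for driving a pulsed source.

```python
def sigma_delta_1bit(samples):
    """Textbook first-order 1-bit sigma-delta modulator (illustrative assumption).

    samples: iterable of floats in [-1.0, 1.0], already at the oversampled rate.
    Returns a list of +1 / -1 values whose running average tracks the input.
    """
    integrator = 0.0   # accumulates the error between input and fed-back output
    feedback = 0.0     # previous quantizer output, fed back into the loop
    bits = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(1 if feedback > 0 else -1)
    return bits


# Usage: a constant input of 0.5 yields a bit stream whose mean is close to 0.5.
stream = sigma_delta_1bit([0.5] * 4000)
mean_value = sum(stream) / len(stream)
```

A multi-bit variant would replace the two-level quantizer with a coarse multi-level one; how the prototype maps bits to laser pulses is described in the paper itself and is not modeled here.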

Sound Level Monitoring at Live Events, Part 2--Regulations, Practices, and Preferences

Authors: Mulder, Johannes; Hill, Adam J.; Burton, Jon; Kok, Marcel; Lawrence, Michael


This paper considers existing regulations, practices, and preferences regarding the measurement, monitoring, and management of sound levels at live music events. It brings together a brief overview of current regulations with the outcomes of a recent international survey of live sound engineers and evaluation of three datasets of sound measurement at live music events. The paper reveals the benefit of a 15-min time frame for the definition of equivalent continuous sound level limits in comparison to longer or shorter time frames. The paper also reveals support from the live sound engineering community for the application of sound level limits and development of a global certification system for live sound engineers.
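
For readers unfamiliar with the quantity being limited, the equivalent continuous sound level over a time frame is the level of the average sound energy in that frame. The sketch below assumes equally spaced short-term level readings in dB (for example, one reading per second, so 900 readings per 15-min frame); the function names `leq` and `rolling_leq` are hypothetical and do not come from the paper.

```python
import math


def leq(levels_db):
    """Equivalent continuous level (dB) of equally spaced short-term levels."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def rolling_leq(levels_db, window):
    """Leq over every full sliding window of `window` consecutive readings."""
    return [
        leq(levels_db[end - window:end])
        for end in range(window, len(levels_db) + 1)
    ]


# Usage: with one reading per second, window=900 gives a rolling 15-min Leq.
example = leq([90.0, 100.0])
```

Because the average is taken over energy rather than over decibel values, loud passages dominate the result: the 90 dB and 100 dB readings above average to roughly 97.4 dB, not 95 dB.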

Sound Level Monitoring at Live Events, Part 3--Improved Tools and Procedures

Authors: Hill, Adam J.; Mulder, Johannes; Burton, Jon; Kok, Marcel; Lawrence, Michael


This is the final installment in a series of three papers looking into the subject of sound level monitoring at live events. The first two papers revealed how practical shortcomings and audience and neighbor considerations (in the form of sound level limits) can impact the overall live experience. This paper focuses on an improved set of tools for sound engineers to ensure a high-quality and safe live event experience while maintaining compliance with local sound level limits. This includes data processing tools to predict future limit violations and guidelines for improved user interface design. Practical procedures, including effective sound level monitoring practice, alongside resourceful mixing techniques are presented to provide a robust toolset that can allow sound engineers to perform their best without compromising the listening experience in response to local sound level limits.
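
One simple way to anticipate a limit violation, offered here only as an illustrative assumption and not as the tool described in the paper, is to compute how loud the rest of a measurement window may be on average before the window's equivalent continuous level exceeds the limit. The function name `allowed_remaining_leq` and its parameters are hypothetical.

```python
import math


def allowed_remaining_leq(elapsed_levels_db, window_length, limit_db):
    """Highest average level (dB) allowed for the rest of the window.

    elapsed_levels_db: equally spaced readings already taken in this window.
    window_length: total number of readings in the window.
    limit_db: Leq limit for the whole window.
    Returns None if the energy budget for the window is already used up.
    """
    remaining = window_length - len(elapsed_levels_db)
    if remaining <= 0:
        raise ValueError("window is already complete")
    budget = window_length * 10 ** (limit_db / 10)          # total allowed energy
    used = sum(10 ** (level / 10) for level in elapsed_levels_db)
    if used >= budget:
        return None                                         # violation is unavoidable
    return 10 * math.log10((budget - used) / remaining)


# Usage: half the window was 3 dB under a 100 dB limit, so the second half
# may run somewhat above 100 dB on average without exceeding the limit.
headroom = allowed_remaining_leq([97.0, 97.0], window_length=4, limit_db=100.0)
```

A value that falls below the level the show is currently running at would signal that the limit will be exceeded unless the mix is brought down.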

Engineering Reports

The Internet of Things (IoT) is fostering advancements in the embedded systems world, widening the range of available single-board computers and lowering their price. The Internet of Musical Things (IoMusT), the musical counterpart of the IoT, is thriving as well, with more and more examples of embedded devices useful for building connected musical interfaces. For this purpose, real-time architectures based on the Linux operating system are increasingly used. In this paper, we compare two radically different approaches to real-time Linux audio: one system is based on the PREEMPT_RT patch and the ALSA framework, and the other on the Xenomai patch and the Elk Audio OS. Our study aims to provide audio developers working on IoMusT devices and applications with a clear quantitative picture of how these two systems compare. Our results reveal that Xenomai provides lower audio round-trip latency and lower scheduling latency, and exploits more CPU performance at a given latency setting while guaranteeing perfect audio quality. Nevertheless, PREEMPT_RT still delivers good performance and is widely supported, making it a more accessible alternative. All the tests were carried out on the Raspberry Pi 4B single-board computer combined with the HiFiBerry expansion HAT.

Standards and Information Documents

AES Standards Committee News


Call for Papers

AES New Officers

AES Financial Report



Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
