Journal of the Audio Engineering Society

2021 December - Volume 69 Number 12


Audio delay is a crucial factor in the Quality of Musicians' Experience (QoME) in Network Music Performance. Previous studies have explored musicians' tolerance to delay and its dependence on the timbre of the instruments used and the tempo of the performance. Although their findings are intriguing, the small size of these studies makes it difficult to draw firm conclusions. In order to shed more light on these issues, we undertook a larger-scale Network Music Performance study with real musicians, assessing a wide range of subjective QoME variables against delay, and correlated these results with the audio characteristics of the instruments and performance. Because of the large number of participants, our findings validate and extend previous studies with a wider array of QoME variables and audio characteristics.

This paper describes an immersive audio Network Music Performance (NMP) system designed for group singing. A prototype of this design (audio only) was deployed to ten singers across Europe, who participated in a duet vocal performance study, operating the system from their home networks. Parametric evaluation of these vocal performances was conducted in order to characterize musical interactivity between performers and explore the challenges and opportunities presented for immersive audio NMP systems in practical use-case settings. Results demonstrate that performance conforming to expectations of live interactivity is achievable and estimate the conditions under which this may be achieved. A significant effect of latency, and in one case of virtual room "type," is observed across performances. Informal questionnaire responses prompt discussion of the potential for virtual acoustics and latency to affect the perceptual experience of networked performers.

The recent lockdown restrictions imposed by the severe acute respiratory syndrome coronavirus 2 pandemic have heightened the need for new forms of remote collaboration for music schools, conservatories, musician ensembles, and artists, each of which would benefit from adequate tools for making high-quality, live collaborative music in a distributed fashion. This paper demonstrates the use of the Networked Music Performance software JackTrip to support a distributed classical concert involving singers and musicians at four different locations on two continents, using readily available hardware/software solutions and internet connections while guaranteeing high-fidelity audio quality. The paper provides a description of the technical setup, a numerical analysis of the achieved mouth-to-ear latency, and an assessment of the music-making experience as perceived by the performers.

Smart musical instruments (SMIs) are an emerging category of musical instruments characterized by sensors, actuators, wireless connectivity, and embedded intelligence. To date, a topic that has received remarkably little attention in SMI research is that of defining a file format for the offline exchange of content produced by such instruments. To address this gap, in this paper we propose the Smart Musical Instruments Format (SMIF), a file format specific to smart musical instruments. We also provide an implementation of an encoder, decoder, and player for it. The format does not fully conform to any current standard but is strongly inspired by the MPEG-A: Interactive Music Application Format (IM AF). In our implementation we extended IM AF with tracks for sensor data and MIDI, as well as a representation of the instrument's sound engine via the Smart Musical Instruments Ontology. SMIF enables the creation of novel applications for the offline exchange of SMI configurations and data, some of which are illustrated in the paper.

Design Recommendations for a Collaborative Game of Bird Call Recognition Based on Internet of Sound Practices

Authors: Rovithis, Emmanouel; Moustakas, Nikolaos; Vogklis, Konstantinos; Drossos, Konstantinos; Floros, Andreas

Citizen Science aims to engage people in research activities on important issues related to their well-being. Smart Cities aim to provide them with services that improve their quality of life. Both concepts have seen significant growth in recent years and can be further enhanced by combining their purposes with Internet of Things technologies that allow for dynamic and large-scale communication and interaction. However, sparking and retaining participants' interest is a key factor for such initiatives. In this paper we suggest that engagement in Citizen Science projects applied on Smart Cities infrastructure can be enhanced through contextual and structural game elements realized through augmented audio interactive mechanisms. Our interdisciplinary framework is described through the paradigm of a collaborative bird call recognition game, in which users collect and submit audio data that are then classified and used for augmenting physical space. We discuss the Playful Learning, Internet of Audio Things, and Bird Monitoring principles that shaped the design of our paradigm and analyze the design issues of its potential technical implementation.

Acoustic direction of arrival estimation methods allow positional information about sound sources to be transmitted over a network using minimal bandwidth. For these purposes, methods that prioritize low computational overhead and consistent accuracy under non-ideal conditions are preferred. The estimation method introduced in this paper uses a set of steered beams to estimate directional energy at sparsely distributed orientations around a spherical microphone array. By iteratively adjusting beam orientations based on the orientation of maximum energy, an accurate orientation estimate of a sound source may be produced with minimal computational cost. Incorporating conditions based on temporal smoothing and diffuse energy estimation further refines this process. Testing under simulated conditions indicates favorable accuracy under reverberation and source discrimination when compared with several other contemporary localization methods. Outcomes include an average localization error of less than 10° under 2 s of reverberation time (T60) and the potential to separate up to four sound sources under the same conditions. Results from testing in a laboratory environment demonstrate potential for integration into real-time frameworks.
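The iterative coarse-to-fine search described in this abstract can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: the energy function is a hypothetical stand-in for a steered beamformer on a spherical array (modeling a single source at an assumed azimuth of 40°), and the one-dimensional azimuth-only search, beam count, and iteration count are illustrative choices.

```python
import math

def directional_energy(azimuth_deg, source_az_deg=40.0):
    """Toy stand-in for a steered-beam energy measurement.

    A real system would steer a beamformer on a spherical microphone
    array; here a single source at source_az_deg is modeled with a
    smooth energy peak (hypothetical, for illustration only).
    """
    diff = math.radians(azimuth_deg - source_az_deg)
    return max(0.0, math.cos(diff)) ** 2  # peaks at the source direction

def iterative_doa(n_beams=8, iterations=5):
    """Coarse-to-fine search: evaluate a sparse grid of beams,
    re-center the grid on the strongest beam, shrink the span."""
    center, span = 180.0, 360.0
    for _ in range(iterations):
        step = span / n_beams
        beams = [(center - span / 2 + i * step) % 360 for i in range(n_beams)]
        center = max(beams, key=directional_energy)  # strongest beam
        span = 2 * step  # narrow the search around the best beam
    return center

print(iterative_doa())  # converges near the simulated 40-degree source
```

Each iteration evaluates only n_beams beam directions rather than a dense grid, which is the source of the low computational cost the abstract refers to; the paper's temporal-smoothing and diffuse-energy conditions are omitted here.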

Standards and Information Documents

AES Standards Committee News


Much of the current research and development in the field of loudspeaker transducers is concerned with the challenges of how to make them both smaller and louder. In some cases this involves novel mechanical designs, and in others it involves the use of signal processing to enable existing designs to be pushed closer and closer to their limits. In relation to the directional characteristics of transducers in different contexts, two studies are reported here. One concerns the effect of a user’s hands on the directional radiation of a mobile phone speaker, and the other deals with the directionality and noise pollution resulting from different configurations of subwoofers in a sound reinforcement application.

Call for Nominations

Index to Vol 69



Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
