AES E-Library

S^3MASH: Spatial Sound Scene Matching using Single-Channel Audio

This paper describes a novel approach for recording and binaurally reproducing spatial sound scenes using the audio from a single microphone. This is realised by recording the sound scene using both a microphone array, which potentially comprises more affordable and lower-quality capsules, and a monophonic microphone, possibly featuring a higher-quality capsule. By adopting a perceptually motivated sound-field model and estimating the model's spatial parameters, it is possible to define target time-frequency-dependent binaural spatial covariance matrices (SCMs). The actual binaural signals can then be synthesised using an adaptive SCM matching renderer, which takes only the higher-quality monophonic audio signal as input. A perceptual study was conducted to compare this novel processing approach, using a tetrahedral array and an omnidirectional microphone, against binaural renderings achieved through traditional Ambisonic means, when using four- and 32-channel arrays. The results show that, despite utilising only a monophonic signal for the spatialisation, the proposed approach yielded binaural renderings that are perceptually in between the two conventional Ambisonic array renderings with regard to their perceived spatial accuracy.
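To make the idea of matching a target binaural SCM from a single channel more concrete, the following is a minimal sketch for one time-frequency tile. It is not the paper's algorithm: the abstract does not specify the sound-field model, so a simple direct-plus-diffuse model is assumed here, and the function names (`target_scm`, `matching_matrix`) and all parameter values are hypothetical. The sketch also assumes that two decorrelated copies of the mono signal are available, each with the same energy as the mono signal and mutually uncorrelated with it.

```python
import numpy as np

def target_scm(energy, hrtf_pair, direct_ratio, diffuse_coherence):
    """Target 2x2 binaural SCM for one tile (assumed direct-plus-diffuse model)."""
    h = np.asarray(hrtf_pair, dtype=complex).reshape(2, 1)
    direct = direct_ratio * (h @ h.conj().T)
    diffuse = (1.0 - direct_ratio) * np.array(
        [[1.0, diffuse_coherence], [diffuse_coherence, 1.0]], dtype=complex
    )
    return energy * (direct + diffuse)

def matching_matrix(energy, input_energy, hrtf_pair, direct_ratio, diffuse_coherence):
    """2x3 mixing matrix applied to [mono, decorrelated_1, decorrelated_2].

    Column 0 routes the mono signal through the assumed HRTF pair;
    columns 1-2 shape the decorrelated signals to the assumed diffuse
    coherence. Requires |diffuse_coherence| < 1 for the Cholesky factor.
    """
    h = np.asarray(hrtf_pair, dtype=complex).reshape(2, 1)
    coherence = np.array(
        [[1.0, diffuse_coherence], [diffuse_coherence, 1.0]], dtype=complex
    )
    chol = np.linalg.cholesky(coherence)  # chol @ chol^H == coherence
    mix = np.hstack([np.sqrt(direct_ratio) * h,
                     np.sqrt(1.0 - direct_ratio) * chol])
    return np.sqrt(energy / input_energy) * mix

# Hypothetical parameter values for a single tile.
hrtf = [0.9, 0.4 * np.exp(-0.5j)]
C_target = target_scm(2.0, hrtf, 0.6, 0.3)
M = matching_matrix(2.0, 0.5, hrtf, 0.6, 0.3)

# Input covariance is input_energy * I (mutually uncorrelated signals),
# so the output covariance is input_energy * M @ M^H.
C_out = 0.5 * (M @ M.conj().T)
print(np.allclose(C_out, C_target))
```

In this sketch the output covariance equals the target by construction; an adaptive renderer would recompute the mixing matrix per tile as the estimated spatial parameters change.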

 

Permalink: https://aes2.org/publications/elibrary-page/?id=22664



E-Library location: 16938