Binaural sound source localisation is increasingly performed using Convolutional Neural Networks (CNNs). These networks take a time-frequency representation of audio as input and use it to estimate the direction of arrival of a sound. Previous works have used a variety of time-frequency representations, but none has relied solely on magnitude spectra, so the importance of magnitude information in full azimuthal binaural sound source localisation is not well understood. This work aims to address that gap by testing the performance of a CNN trained and tested on four different time-frequency representations: Mel-Spectrogram, Gammatonegram, Mel-Frequency Cepstrum, and Gammatone-Frequency Cepstrum. The tests found that spectrograms are suitable for the task of full azimuthal binaural sound source localisation.
Author(s): Reed-Jones, Jago T.; Jones, Karl O.; Fergus, Paul; Marsland, John; Ellis, David L.
Affiliation:
Liverpool John Moores University, Liverpool, UK (all authors)
(See document for exact affiliation information.)
AES Convention: 154
Paper Number: 59
Publication Date:
2023-05-06
Session subject:
Spatial Audio
Permalink: https://aes2.org/publications/elibrary-page/?id=22084
Reed-Jones, Jago T.; Jones, Karl O.; Fergus, Paul; Marsland, John; Ellis, David L.; 2023; Comparison of Performance in Binaural Sound Source Localisation using Convolutional Neural Networks for differing Feature Representations [PDF]; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Paper 59; Available from: https://aes2.org/publications/elibrary-page/?id=22084
Reed-Jones, Jago T.; Jones, Karl O.; Fergus, Paul; Marsland, John; Ellis, David L.; Comparison of Performance in Binaural Sound Source Localisation using Convolutional Neural Networks for differing Feature Representations [PDF]; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Liverpool John Moores University, Liverpool, UK; Paper 59; 2023; Available: https://aes2.org/publications/elibrary-page/?id=22084
@article{reed-jones2023comparison,
author={Reed-Jones, Jago T. and Jones, Karl O. and Fergus, Paul and Marsland, John and Ellis, David L.},
journal={Journal of the Audio Engineering Society},
title={Comparison of Performance in Binaural Sound Source Localisation using Convolutional Neural Networks for differing Feature Representations},
year={2023},
number={59},
month={may},
}
TY  - paper
TI  - Comparison of Performance in Binaural Sound Source Localisation using Convolutional Neural Networks for differing Feature Representations
AU  - Reed-Jones, Jago T.
AU  - Jones, Karl O.
AU  - Fergus, Paul
AU  - Marsland, John
AU  - Ellis, David L.
PY  - 2023
JO  - Journal of the Audio Engineering Society
VL  - 59
Y1  - May 2023