The deployment of machine listening algorithms in real-world application scenarios is challenging. In this paper, we investigate how the superposition of multiple sound events within complex sound scenes affects their recognition. As a basis for our research, we introduce the Urban Sound Monitoring (USM) dataset, a novel public benchmark dataset for urban sound monitoring tasks. It includes 24,000 sound scenes that are mixed from isolated sounds using different loudness levels, sound polyphony levels, and stereo panorama placements. In a benchmark experiment, we evaluate three deep neural network architectures for sound event tagging (SET) on the USM dataset. In addition to counting the overall number of sounds in a sound scene, we introduce a local sound polyphony measure as well as temporal and frequency coverage measures of sounds, which together allow complex sound scenes to be characterized. The analysis of these measures confirms that SET performance decreases for higher sound polyphony levels and larger temporal coverage of sounds.
Author(s): Abeßer, Jakob
Affiliation: Fraunhofer IDMT, Ilmenau, Germany
(See document for exact affiliation information.)
AES Convention: 152
Paper Number: 10570
Publication Date: 2022-05-06
Session subject: Sound Classification
Permalink: https://aes2.org/publications/elibrary-page/?id=21683
Abeßer, Jakob; 2022; Classifying Sounds in Polyphonic Urban Sound Scenes [PDF]; Fraunhofer IDMT, Ilmenau, Germany; Paper 10570; Available from: https://aes2.org/publications/elibrary-page/?id=21683
Abeßer, Jakob; Classifying Sounds in Polyphonic Urban Sound Scenes [PDF]; Fraunhofer IDMT, Ilmenau, Germany; Paper 10570; 2022 Available: https://aes2.org/publications/elibrary-page/?id=21683