The introduction and regulation of loudness in broadcasting and streaming brought clear benefits to the audience, e.g., a level of uniformity across programs and channels. Yet speech loudness is frequently reported as being too low in certain passages, which can hinder the full understanding and enjoyment of movies and TV programs. This paper proposes expanding the set of loudness-based measures typically used in the industry. We focus on speech loudness, and we show that, when clean speech is not available, Deep Neural Networks (DNNs) can be used to isolate the speech signal and thus to estimate speech loudness more precisely than speech-gated loudness does. Moreover, we define critical passages, i.e., passages in which speech is likely to be hard to understand. Critical passages are defined based on the local Speech Loudness Deviation (SLD) and the local Speech-to-Background Loudness Difference (SBLD), as SLD and SBLD significantly contribute to intelligibility and listening effort. In contrast to other, more comprehensive measures of intelligibility and listening effort, SLD and SBLD can be straightforwardly measured, are intuitive, and, most importantly, can be easily controlled by adjusting the speech level in the mix or by enabling personalization at the user's end. Finally, examples are provided that show how the detection of critical passages can support the evaluation and control of the speech signal during and after content production.
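As a rough illustration of how critical passages could be flagged from SLD and SBLD, the following minimal Python sketch works on per-window loudness values that are assumed to be already measured. Everything in it is an assumption made for illustration and is not taken from the paper: the `Window` type, the function name, the reading of SLD as local speech loudness minus integrated speech loudness, the reading of SBLD as local speech loudness minus local background loudness, the rule that either condition alone marks a window as critical, and both threshold values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Window:
    """Loudness of one analysis window (hypothetical container)."""
    speech_lufs: float      # local speech loudness, in LUFS
    background_lufs: float  # local background loudness, in LUFS


def find_critical_passages(
    windows: List[Window],
    integrated_speech_lufs: float,
    sld_threshold_lu: float = -5.0,   # illustrative value, not from the paper
    sbld_threshold_lu: float = 5.0,   # illustrative value, not from the paper
) -> List[int]:
    """Return indices of windows flagged as critical under the assumed rule."""
    flagged = []
    for index, window in enumerate(windows):
        # Assumed SLD: how far local speech sits from the program's speech loudness.
        sld = window.speech_lufs - integrated_speech_lufs
        # Assumed SBLD: how far local speech sits above the local background.
        sbld = window.speech_lufs - window.background_lufs
        # Assumed rule: either measure falling below its threshold flags the window.
        if sld < sld_threshold_lu or sbld < sbld_threshold_lu:
            flagged.append(index)
    return flagged


example_windows = [
    Window(speech_lufs=-23.0, background_lufs=-40.0),  # speech clear of background
    Window(speech_lufs=-30.0, background_lufs=-45.0),  # speech well below program level
    Window(speech_lufs=-24.0, background_lufs=-26.0),  # speech close to background
]
critical = find_critical_passages(example_windows, integrated_speech_lufs=-23.0)
print(critical)
```

In this sketch the second window is flagged because its speech is 7 LU below the integrated speech loudness, and the third because its speech is only 2 LU above the background. In practice, the thresholds and the windowing would need to follow the definitions given in the paper itself.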
Authors: Torcoli, Matteo; Halimeh, Mhd Modar; Leitz, Thomas; Grewe, Yannik; Kratschmer, Michael; Murtaza, Adrian; Fuchs, Harald; Habets, Emanuel; Neugebauer, Bernhard
Affiliations: Fraunhofer IIS; DSP Solutions; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS
(See document for exact affiliation information.)
AES Convention: 156
Paper Number: 10698
Publication Date: 2024-06-06
Permalink: https://aes2.org/publications/elibrary-page/?id=22511
Torcoli, Matteo; Halimeh, Mhd Modar; Leitz, Thomas; Grewe, Yannik; Kratschmer, Michael; Murtaza, Adrian; Fuchs, Harald; Habets, Emanuel; Neugebauer, Bernhard; 2024; Speech Loudness in Broadcasting and Streaming [PDF]; Fraunhofer IIS; DSP Solutions; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Paper 10698; Available from: https://aes2.org/publications/elibrary-page/?id=22511
Torcoli, Matteo; Halimeh, Mhd Modar; Leitz, Thomas; Grewe, Yannik; Kratschmer, Michael; Murtaza, Adrian; Fuchs, Harald; Habets, Emanuel; Neugebauer, Bernhard; Speech Loudness in Broadcasting and Streaming [PDF]; Fraunhofer IIS; DSP Solutions; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Fraunhofer IIS; Paper 10698; 2024 Available: https://aes2.org/publications/elibrary-page/?id=22511
@article{torcoli2024speech,
author={Torcoli, Matteo and Halimeh, Mhd Modar and Leitz, Thomas and Grewe, Yannik and Kratschmer, Michael and Murtaza, Adrian and Fuchs, Harald and Habets, Emanuel and Neugebauer, Bernhard},
journal={Journal of the Audio Engineering Society},
title={Speech Loudness in Broadcasting and Streaming},
year={2024},
number={10698},
month={may},
}
TY – paper
TI – Speech Loudness in Broadcasting and Streaming
AU – Torcoli, Matteo
AU – Halimeh, Mhd Modar
AU – Leitz, Thomas
AU – Grewe, Yannik
AU – Kratschmer, Michael
AU – Murtaza, Adrian
AU – Fuchs, Harald
AU – Habets, Emanuel
AU – Neugebauer, Bernhard
PY – 2024
JO – Journal of the Audio Engineering Society
VL – 10698
Y1 – May 2024