This paper presents a novel approach for estimating the octave-band Energy Decay Curve (EDC) from noisy reverberant speech signals using Generative Adversarial Networks (GANs). Traditional methods of room acoustic analysis often rely on parameters such as Reverberation Time (RT) or Clarity (C50), or on direct estimation of the Room Impulse Response (RIR). However, these approaches can be limited by their dependence on controlled experimental setups and/or detailed knowledge of the acoustic environment. By contrast, the proposed method uses GANs to estimate the sub-band EDCs directly from noisy reverberant speech signals, providing a more comprehensive characterization of the room's acoustic properties than single-valued parameters. The EDC gives a more detailed and holistic view of the room's acoustic behaviour than single-parameter metrics such as RT, and estimating it avoids the artefacts that come with generating RIRs directly. It offers a middle ground that is more useful for synthesizing reverberation similar to that of the real RIR. To evaluate the effectiveness of our approach, we conducted a series of experiments comparing the proposed GAN-based EDC estimation to state-of-the-art models on well-known benchmarks. The results demonstrate that our method not only achieves superior accuracy but is also robust to variations in sub-band reverberation characteristics. The ability to estimate the EDC directly from speech signals without a priori knowledge of the room highlights the practical applicability of our approach in real-world scenarios.
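For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of how an EDC is conventionally obtained when the RIR is known, using Schroeder backward integration. It is not the paper's GAN-based method, which estimates the EDC blindly from speech. The synthetic RIR (exponentially decaying noise), the sample rate, the 0.5 s reverberation time, and the helper name `energy_decay_curve_db` are all assumptions chosen for illustration. The sketch is full-band; an octave-band EDC would apply the same integration to a band-pass filtered RIR.

```python
import numpy as np


def energy_decay_curve_db(rir):
    """Schroeder backward integration: energy remaining after each sample, in dB re total energy."""
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]      # sum of energy from sample n to the end
    edc = edc / edc[0]                       # normalize so the curve starts at 0 dB
    return 10.0 * np.log10(np.maximum(edc, 1e-12))


# Synthetic RIR (illustrative assumption): white noise with an exponential envelope
fs = 16000                                   # sample rate in Hz (assumed)
rt60 = 0.5                                   # target reverberation time in s (assumed)
t = np.arange(int(2 * rt60 * fs)) / fs
rng = np.random.default_rng(0)
# Amplitude envelope drops by 60 dB (factor 1000) at t = rt60
rir = rng.standard_normal(t.size) * np.exp(-np.log(1000.0) * t / rt60)

edc_db = energy_decay_curve_db(rir)

# A single-valued parameter such as RT can be read off the EDC by a line fit
# (here between -5 dB and -25 dB, extrapolated to a 60 dB decay).
mask = (edc_db <= -5.0) & (edc_db >= -25.0)
slope, intercept = np.polyfit(t[mask], edc_db[mask], 1)
rt_estimate = -60.0 / slope
print(f"Estimated RT from EDC: {rt_estimate:.2f} s")
```

The line fit at the end illustrates the abstract's point that RT is a single number derived from the EDC, so the full curve carries strictly more information about the decay than RT alone.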
Authors: Saini, Shivam; Peissig, Jürgen
Affiliation:
Leibniz Universität Hannover, Institut für Kommunikationstechnik, Hanover, Germany, and Huawei Munich Research Center, Munich, Germany; Leibniz Universität Hannover, Institut für Kommunikationstechnik, Hanover, Germany
(See document for exact affiliation information.)
Publication Date:
2024-08-05
Session subject:
Audio for Virtual and Augmented Reality
Permalink: https://aes2.org/publications/elibrary-page/?id=22682
Saini, Shivam; Peissig, Jürgen; 2024; Leveraging GANs for a better Blind Room Identification through Energy Decay Curve Estimation [PDF]; Leibniz Universität Hannover, Institut für Kommunikationstechnik, Hanover, Germany, and Huawei Munich Research Center, Munich, Germany; Leibniz Universität Hannover, Institut für Kommunikationstechnik, Hanover, Germany; Paper 33; Available from: https://aes2.org/publications/elibrary-page/?id=22682