2025 AES International Conference on Artificial Intelligence and Machine Learning for Audio Program
Schedule is subject to change.
View the listings of Accepted Papers and of Panels and Tutorials that will be slotted into the schedule.
8:00 AM – 9:20 AM — Registration & Coffee Break
Welcome Ceremony
9:40 AM – 10:40 AM
12:00 PM – 1:40 PM
1:40 PM – 2:40 PM
8:00 AM – 9:00 AM
Workshops & Paper Sessions
5:30 PM – 11:00 PM — Optional Social Event: Private Evening Cruise on the Elizabethan
Registration
1:40 PM – 2:40 PM — Final Jam
September 8 – 9:20am-10:00am
Real-Time Neural Audio Inference
F. Caspe and Jatin Chowdhury
September 8 – 10:00am-10:40am
How to Generate Large-Scale Room-Acoustic Datasets: Leveraging State-of-the-Art Simulation Methods to Improve Downstream Performance of Data-Driven Approaches
Georg Götz and Finnur Pind
September 8 – 4:00pm-5:00pm
Privacy for Audio AI: Risks, Challenges, and Emerging Solutions in the Era of Audio AI
Thomas Deacon, Jennifer Williams, Jason R. C. Nurse, Christopher Hicks, Gabriel Bibbó, Arshdeep Singh, and Mark D. Plumbley
September 10 – 3:20pm-4:20pm
Neural Audio Coding Techniques and Their Evaluation
Pablo M Delgado, Jürgen Herre, Jan Skoglund, Stéphane Ragot, and Julian Parker
A Listener-Evaluated Dataset of Amateur Karaoke Singing and Audiobook Narration
Elena Georgieva, Pablo Ripollés and Brian McFee
A Machine Learning Approach to Modal Control in Small Rooms
Carlo Bolla, Trevor Cox and Bruno Fazenda
A Scalable AI Architecture for Audio and Multimodal Analysis on Mobile Devices: A Case of Environmental Monitoring
Marina Eirini Stamatiadou, Athanasia Mpesmerti, Nikolaos Vryzas, Lazaros Vrysis and Charalampos Dimoulas
Adaptive Neural Audio Mixing with Human-in-the-Loop Feedback: A Reinforcement Learning Approach
Shanshan Zhu and Mohammad Nasim
AudioGAN: A Compact and Efficient Framework for Real-Time High-Fidelity Text-to-Audio Generation
Haechun Chung
Automatic Audio Equalization with Semantic Embeddings
Eloi Moliner, Vesa Välimäki, Konstantinos Drosos and Matti Hämäläinen
Broadcast-Quality Synthetic Narration: A Workflow for Fine-Grained Text-to-Speech Intonation and Emotion Control
Luiz Fernando Kruszielski, Pedro H.L. Leite, Myllene P. Fernandes, Andre Pereira and Luiz W. P. Biscainho
Challenges in Predicting the Lyric Intelligibility of Musical Segments for Older Individuals with Hearing Loss
William M. Whitmer, David McShefferty, Michael A. Akeroyd, Scott C. Bannister, Jon P. Barker, Trevor J. Cox, Bruno M. Fazenda, Jennifer Firth, Simone N. Graetzer, Alinka E. Greasley, Gerardo Roa Dabike and Rebecca Vos
Complex-Valued Physics-Informed Neural Networks for Sound Field Estimation
Vlad-Stefan Paul, Nara Hahn and Philip Nelson
Compressing Neural Network Models of Audio Distortion Effects Using Knowledge Distillation Techniques
Riccardo Simionato and Aleksander Tidemann
Compression of Higher Order Ambisonics with Multichannel RVQGAN
Toni Hirvonen and Mahmoud Namazi
Establishing a Virtual Listener Panel for Audio Characterisation
Michelle Herlufsen, Niels Asp Fuglsang and Benjamin Pedersen
Extraction and Neural Synthesis of Timbre for Head-Related Transfer Functions
Mary Pilataki, Chris Buchanan and Cal Armstrong
Faust Autodiff: Towards Audio Domain-Specific Machine Learning
Thomas Rushton, Yann Orlarey, Romain Michon, Tanguy Risset and Stéphane Letz
Flute Tone Quality Classification: A Machine-Learning-Based Instructional Tool
Nikita Sane and Jonathan Abel
From CNN to Reservoir Computing: A New Perspective on Acoustic Scene Classification
Yuxuan He, Alireza Molla Ali Hosseini, Jakob Abeßer, Lina Jaurigue, Alexander Raake and Kathy Lüdge
Hybrid Learning-based Active Noise Control in Encapsulated Structures
Alkahf Alkahf, Hamid Reza Karimi and Francesco Ripamonti
Improved Singing Voice Conversion with Frame-Level Content and Melody-Informed Speaker Embeddings Using Cross-Attention
Jih-Wei Yeh, Elaine M. Liu and Yi-Wen Liu
Improvement and Cross Domain Evaluation of Slow-Fast-Networks
Ravi Kumar, Sascha Grollmisch and Jakob Abeßer
Integrating IP Broadcasting with Audio Tags: Workflow and Challenges
Rhys Burchett-Vass, Arshdeep Singh, Gabriel Bibbó and Mark D. Plumbley
It's All About Speed: AI's Impact on Workflow in Music Production
Finn McClellan and Fabio Morreale
Motor2Synth: Leveraging Differential Digital Signal Processing for Generating Combustion Engine Sounds Compatible with Active Sound Design Frameworks
Thiago Henrique Gomes Lobato, Stefan Hank, Hanyi Zhang and Haofu Luo
Multiple Loudspeaker Localization with Simultaneous Deconvolution
Sunil Bharitkar and Adrian Celestinos
NablAFx: A Framework for Differentiable Black-box and Gray-box Modeling of Audio Effects
Marco Comunità, Christian Steinmetz and Josh Reiss
Neutone SDK: An Open Source Framework for Neural Audio Processing
Christopher Mitcheltree, Bogdan Teleaga, Andrew Fyfe, Naotake Masuda, Matthias Schäfer, Alfie Bradic and Nao Tokui
Perceiving AI in Music: Human Evaluation of AI-Generated Melodies and AI Detection Sensitivity
Michael Oehler, Jasper Oldach and Florian Zwißler
Perceptions of an Artificial Intelligence Musical Collaborator
Becky Allen and Ronald Mo
Predicting Binaural Colouration using VGGish Embeddings
Thomas McKenzie, Alec Wright, Daniel Turner and Pedro Lladó
Procedural Music Generation Systems in Games
Shangxuan Luo and Joshua Reiss
Psychoacoustics of Machine Learning Amp Emulation Plugins
Mario Vallejo, Michael McLoughlin and Gavin Kearney
Simulating 3D Acoustic Radiation and Scattering in the Frequency Domain with Fourier Neural Operators (FNOs)
James Hipperson, Jonathan A. Hargreaves and Trevor J. Cox
Sound Matching an Analogue Levelling Amplifier Using the Newton-Raphson Method
Chin-Yun Yu and George Fazekas
Supervised Machine Learning for Near-Field Microphone Position Recovery
Gregg O’Donnell
Transfer Learning for Neural Modelling of Nonlinear Distortion Effects
Tara Vanhatalo, Pierrick Legrand, Myriam Desainte-Catherine, Pierre Hanna, Guillaume Pille, Antoine Brusco and Joshua Reiss
Unstable Audio: Code Bending Text-to-Music Generation
Nick Collins