Music mixing involves transforming clean, individual tracks into a cohesive final mix using audio effects and expert knowledge. While rule-based and machine learning methods have shown promise, scaling them to real-world situations remains challenging. We propose a two-stage mixing architecture that combines domain knowledge with deep learning, enabling the system to handle over 100 input tracks with high perceptual quality.
The first stage uses a rule-based level balancing system to mix grouped tracks into stems. The second stage employs a differentiable mixing style transfer model guided by a reference mix. To enhance intra-group (within subgroup) robustness, we refine loudness estimation by incorporating spectral centroid and fundamental frequency features, addressing limitations of Loudness Units relative to Full Scale (LUFS) on narrowband signals.
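The abstract does not say how the level balancing or the refined loudness estimate is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain RMS level in dBFS as a crude stand-in for LUFS, computes the spectral centroid as one of the features the abstract names, and sums equal-level tracks into a stem. The function names, the -20 dBFS target, and the two test tones are all hypothetical.

```python
import numpy as np


def rms_dbfs(x):
    """RMS level in dB relative to full scale (assumed stand-in for LUFS)."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(max(rms, 1e-12))


def spectral_centroid_hz(x, sample_rate):
    """Magnitude-weighted mean frequency of the whole signal, in Hz."""
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    total = magnitude.sum()
    return float((freqs * magnitude).sum() / total) if total > 0 else 0.0


def balance_to_stem(tracks, target_dbfs=-20.0):
    """Scale every track to the same RMS level, then sum them into one stem."""
    gains = [10.0 ** ((target_dbfs - rms_dbfs(t)) / 20.0) for t in tracks]
    stem = sum(g * t for g, t in zip(gains, tracks))
    return stem, gains


# Hypothetical narrowband inputs: a loud low tone and a quiet high tone.
sample_rate = 48000
t = np.arange(sample_rate) / sample_rate
bass = 0.5 * np.sin(2 * np.pi * 100.0 * t)
flute = 0.05 * np.sin(2 * np.pi * 2000.0 * t)

stem, gains = balance_to_stem([bass, flute])
centroids = [spectral_centroid_hz(x, sample_rate) for x in (bass, flute)]
```

Both tones leave this step at the same measured level even though they sit in very different frequency regions. A feature such as the centroid could in principle be used to adjust the target per track, but how the paper combines it with fundamental frequency is not stated here and is not reproduced.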
Subjective listening tests demonstrate that our enhanced intra-group mixing approach consistently outperforms LUFS-based baselines across multiple musical genres. Furthermore, our proposed two-stage system enables deep learning to successfully handle projects with over 100 tracks for the first time, achieving mixing results that significantly surpass those of traditional rule-based systems. Code and audio examples are available at https://doi.org/10.5281/zenodo.17171082.
Author(s): Shi, Jinjie; Xie, Kunzhu; Ma, Yinghao; Reiss, Joshua
Affiliation: Queen Mary University of London; Queen Mary University of London; Queen Mary University of London; Wuhan University of Communication
(See document for exact affiliation information.)
AES Convention: 159
Paper Number: 10232
Publication Date: 2025-10-14
Permalink: https://aes2.org/publications/elibrary-page/?id=23076

Shi, Jinjie; Xie, Kunzhu; Ma, Yinghao; Reiss, Joshua; 2025; A Scalable Two-Stage Automatic Mixing System Integrating Machine Learning and Domain Knowledge [PDF]; Queen Mary University of London; Queen Mary University of London; Queen Mary University of London; Wuhan University of Communication; Paper 10232; Available from: https://aes2.org/publications/elibrary-page/?id=23076
@article{shi2025a,
author={Shi, Jinjie and Xie, Kunzhu and Ma, Yinghao and Reiss, Joshua},
journal={Journal of the Audio Engineering Society},
title={A Scalable Two-Stage Automatic Mixing System Integrating Machine Learning and Domain Knowledge},
year={2025},
number={10232},
month={october},
}