AES E-Library

A Scalable Two-Stage Automatic Mixing System Integrating Machine Learning and Domain Knowledge

Music mixing involves transforming clean, individual tracks into a cohesive final mix using audio effects and expert knowledge. While rule-based and machine learning methods have shown promise, scaling them to real-world situations remains challenging. We propose a two-stage mixing architecture that combines domain knowledge with deep learning, enabling the system to handle over 100 input tracks with high perceptual quality.
The first stage uses a rule-based level balancing system to mix grouped tracks into stems. The second stage employs a differentiable mixing style transfer model guided by a reference mix. To enhance intra-group (within subgroup) robustness, we refine loudness estimation by incorporating spectral centroid and fundamental frequency features, addressing limitations of Loudness Units relative to Full Scale (LUFS) on narrowband signals.
Subjective listening tests show that our enhanced intra-group mixing approach consistently outperforms LUFS-based baselines across multiple musical genres. Furthermore, the proposed two-stage system enables deep learning to handle projects with over 100 tracks for the first time, achieving mixing results that significantly surpass those of traditional rule-based systems. Code and audio examples are available at https://doi.org/10.5281/zenodo.17171082.
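As a rough illustration of the first-stage idea described above (level-balancing grouped tracks into a stem, with a spectral feature available to inform the level estimate), the following is a minimal sketch and not the authors' implementation. It assumes plain RMS level in place of LUFS (no K-weighting or gating), and the function names and the `target_db` value are hypothetical. How the spectral centroid and fundamental frequency actually refine the loudness estimate is not specified in the abstract, so the sketch only computes the centroid as a feature.

```python
import numpy as np

def rms_db(x):
    """RMS level in dB relative to full scale (a simple stand-in, not LUFS)."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(max(rms, 1e-12))

def spectral_centroid_hz(x, sr):
    """Magnitude-weighted mean frequency of the signal's spectrum, in Hz."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(np.sum(freqs * mag) / max(np.sum(mag), 1e-12))

def balance_to_stem(tracks, target_db=-20.0):
    """Gain every track to the same RMS target (hypothetical rule), then sum."""
    gains_db = [target_db - rms_db(t) for t in tracks]
    stem = sum(t * 10.0 ** (g / 20.0) for t, g in zip(tracks, gains_db))
    return stem, gains_db

# Two synthetic narrowband "tracks": a loud low tone and a quiet high tone.
sr = 16000
t = np.arange(sr) / sr
bass = 0.5 * np.sin(2 * np.pi * 100.0 * t)
lead = 0.1 * np.sin(2 * np.pi * 2000.0 * t)

stem, gains_db = balance_to_stem([bass, lead])
centroids = [spectral_centroid_hz(x, sr) for x in (bass, lead)]
```

Here the quieter high tone receives more gain than the louder low tone, and the two centroids separate the tracks by spectral content. A system like the one described would presumably use such features to correct the level estimate for narrowband sources rather than rely on level alone.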

 

Author(s):
Affiliation: (See document for exact affiliation information.)
AES Convention: Paper Number:
Publication Date:
Permalink: https://aes2.org/publications/elibrary-page/?id=23076



Type:
E-Library location: 16938