Label-conditioned source separation extracts the target source, specified by an input symbol, from an input mixture track. A recently proposed label-conditioned source separation model, the Latent Source Attentive Frequency Transformation (LaSAFT)-Gated Point-Wise Convolutional Modulation (GPoCM)-Net, introduced a block for latent source analysis called LaSAFT. Employing LaSAFT blocks, it established state-of-the-art performance on several tasks of the MUSDB18 benchmark. This paper enhances the LaSAFT block with a self-conditioning method. Whereas the existing method considers only the symbolic relationships between the target source symbol and the latent sources, ignoring audio content, the new approach also takes the audio content into account: the enhanced block computes its attention mask conditioned on both the label and the input audio feature map. The paper shows that a conditioned U-Net employing the enhanced LaSAFT blocks outperforms the previous model, and that the model can also perform audio-query-based separation with a slight modification.
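The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea: attention weights over latent sources are computed from a query that depends on both the label and the audio feature map, and are then used to aggregate one frequency transformation per latent source. The function name, every parameter (`label_emb`, `keys`, `w_freq`, `w_audio`), the single-channel feature map, and the additive way the audio enters the query are all assumptions made for illustration. None of them is taken from the paper, and this should not be read as the authors' block.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_conditioned_aggregation(feat, label_emb, keys, w_freq, w_audio):
    """Illustrative sketch only (all names and shapes are hypothetical).

    feat:      (T, F)    feature map: T time frames, F frequency bins
    label_emb: (D,)      embedding of the target-source label
    keys:      (K, D)    one key vector per latent source
    w_freq:    (K, F, F) one frequency transformation per latent source
    w_audio:   (F, D)    projects each audio frame into the query space
    """
    # Query per frame: label embedding plus a projection of the audio frame.
    # With w_audio set to zero this reduces to label-only conditioning.
    query = label_emb[None, :] + feat @ w_audio            # (T, D)

    # Scaled dot-product attention over the K latent sources.
    scores = query @ keys.T / np.sqrt(keys.shape[1])       # (T, K)
    attn = softmax(scores, axis=-1)                        # (T, K)

    # Apply every latent-source transformation along the frequency axis.
    transformed = np.einsum("tf,kfg->tkg", feat, w_freq)   # (T, K, F)

    # Attention-weighted sum over latent sources.
    out = np.einsum("tk,tkg->tg", attn, transformed)       # (T, F)
    return out, attn


# Small usage example with random (untrained) parameters.
rng = np.random.default_rng(0)
T, F, K, D = 5, 8, 3, 4
feat = rng.standard_normal((T, F))
label_emb = rng.standard_normal(D)
keys = rng.standard_normal((K, D))
w_freq = rng.standard_normal((K, F, F))
w_audio = rng.standard_normal((F, D))

out, attn = self_conditioned_aggregation(feat, label_emb, keys, w_freq, w_audio)
print(out.shape, attn.shape)
```

In this sketch, setting `w_audio` to zeros makes the attention weights identical for every frame, which corresponds to conditioning on the label alone; a non-zero `w_audio` lets the weights vary with the audio content.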
Author(s): Choi, Woosung; Jeong, Yeong-Seok; Kim, Jinsung; Chung, Jaehwa; Jung, Soonyoung; Reiss, Joshua D.
Affiliation:
Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science, Korea National Open University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Centre for Digital Music, Queen Mary University of London, London, UK
(See document for exact affiliation information.)
Publication Date:
2022-09-06
Permalink: https://aes2.org/publications/elibrary-page/?id=21880
Choi, Woosung; Jeong, Yeong-Seok; Kim, Jinsung; Chung, Jaehwa; Jung, Soonyoung; Reiss, Joshua D.; 2022; Conditioned Source Separation by Attentively Aggregating Frequency Transformations With Self-Conditioning [PDF]; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science, Korea National Open University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Centre for Digital Music, Queen Mary University of London, London, UK; Paper; Available from: https://aes2.org/publications/elibrary-page/?id=21880
Choi, Woosung; Jeong, Yeong-Seok; Kim, Jinsung; Chung, Jaehwa; Jung, Soonyoung; Reiss, Joshua D.; Conditioned Source Separation by Attentively Aggregating Frequency Transformations With Self-Conditioning [PDF]; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Department of Computer Science, Korea National Open University, Republic of Korea; Department of Computer Science and Engineering, Korea University, Republic of Korea; Centre for Digital Music, Queen Mary University of London, London, UK; Paper; 2022; Available: https://aes2.org/publications/elibrary-page/?id=21880
@article{choi2022conditioned,
  author={Choi, Woosung and Jeong, Yeong-Seok and Kim, Jinsung and Chung, Jaehwa and Jung, Soonyoung and Reiss, Joshua D.},
  journal={Journal of the Audio Engineering Society},
  title={Conditioned Source Separation by Attentively Aggregating Frequency Transformations With Self-Conditioning},
  year={2022},
  volume={70},
  number={9},
  pages={661--673},
  month={September},
}
TY  - JOUR
TI  - Conditioned Source Separation by Attentively Aggregating Frequency Transformations With Self-Conditioning
SP  - 661
EP  - 673
AU  - Choi, Woosung
AU  - Jeong, Yeong-Seok
AU  - Kim, Jinsung
AU  - Chung, Jaehwa
AU  - Jung, Soonyoung
AU  - Reiss, Joshua D.
PY  - 2022
JO  - Journal of the Audio Engineering Society
VL  - 70
IS  - 9
Y1  - September 2022
ER  - 