Timbral analysis of the spectrographic features of popular music sub-genres (or micro-genres) presents unique challenges to the field of computational auditory scene analysis. These challenges stem from the adjacencies among sub-genres and from the complex sonic scenes produced by sophisticated musical textures and production processes. This paper presents a timbral modeling tool based on a modified deep learning natural language processing model. It treats the time frames of a spectrogram as words in a natural language in order to explore their temporal dependencies. The performance metrics obtained from the fine-tuned classifier of the modified Deep Bidirectional Encoder Representations from Transformers (BERT) model show strong semantic modeling performance under different temporal settings. Designed as an automatic feature engineering tool, the proposed framework provides a unique solution to semantic modeling and representation tasks aimed at an objective understanding of subtle musical timbral patterns in highly similar musical genres.
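The idea of treating spectrogram time frames as words can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the helper names `spectrogram_frames` and `to_sequences`, the frame length, hop size, sequence length, and the synthetic test tone. The sketch only shows how a signal might be turned into fixed-length sequences of frame vectors that a sequence model such as BERT could consume in place of word embeddings.

```python
import numpy as np

def spectrogram_frames(signal, frame_len=1024, hop=512):
    """Return a magnitude spectrogram with one row per time frame.

    Hypothetical helper; frame_len and hop are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Each row plays the role of one "word" vector.
    return np.abs(np.fft.rfft(frames, axis=1))

def to_sequences(frames, seq_len=64):
    """Group consecutive frames into fixed-length sequences ("sentences").

    Trailing frames that do not fill a whole sequence are dropped.
    """
    n_seq = len(frames) // seq_len
    return frames[: n_seq * seq_len].reshape(n_seq, seq_len, frames.shape[1])

# Synthetic 5-second, 440 Hz tone standing in for a music excerpt (assumption).
sr = 22050
t = np.arange(sr * 5) / sr
audio = np.sin(2 * np.pi * 440.0 * t)

frames = spectrogram_frames(audio)
sequences = to_sequences(frames)
print(frames.shape, sequences.shape)
```

Varying `seq_len` here corresponds loosely to the "different temporal settings" mentioned in the abstract, although the settings actually used in the paper are not stated on this page.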
Author(s): Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu
Affiliation: University of Miami, FL, USA
(See document for exact affiliation information.)
AES Convention: 150
Paper Number: 10470
Publication Date: 2021-05-06
Session subject: Music Analysis
Permalink: https://aes2.org/publications/elibrary-page/?id=21063
Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu; 2021; Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers [PDF]; University of Miami, FL, USA; Paper 10470; Available from: https://aes2.org/publications/elibrary-page/?id=21063
Geng, Shijia; Ren, Gang; Pan, Xu; Zysman, Joel; Ogihara, Mitsu; Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers [PDF]; University of Miami, FL, USA; Paper 10470; 2021 Available: https://aes2.org/publications/elibrary-page/?id=21063
@article{geng2021sequential,
author={Geng, Shijia and Ren, Gang and Pan, Xu and Zysman, Joel and Ogihara, Mitsu},
journal={Journal of the Audio Engineering Society},
title={Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers},
year={2021},
number={10470},
month={may},
}
TY – paper
TI – Sequential Modeling of Temporal Timbre Series for Popular Music Sub-Genre Analyses Using Deep Bidirectional Encoder Representations from Transformers
AU – Geng, Shijia
AU – Ren, Gang
AU – Pan, Xu
AU – Zysman, Joel
AU – Ogihara, Mitsu
PY – 2021
JO – Journal of the Audio Engineering Society
VL – 10470
Y1 – May 2021