Automatic music transcription with note-level output is an active task in music information retrieval. Whereas piano transcription achieves very good results thanks to large available datasets, transcription of non-professional singing has rarely been investigated with deep learning approaches because note-level annotated datasets are lacking. In this work, two datasets of amateur singing are created: one for training (the synthetic singing dataset) and one for evaluation (the SingReal dataset). The synthetic training dataset is generated by synthesizing a large number of vocal melodies from artificial songs. Because the evaluation should represent a realistic scenario, the SingReal dataset is created from real recordings of non-professional singers. To transcribe singing notes, a new method called Dual Task Monophonic Singing Transcription is proposed. It divides singing transcription into two subtasks, onset detection and pitch estimation, each realized by a small independent neural network. This approach achieves a note-level F1 score of 74.19% on the SingReal dataset, outperforming all investigated state-of-the-art transcription systems by at least 3.5%. Furthermore, Dual Task Monophonic Singing Transcription can be adapted very easily to real-time transcription.
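The split into onset detection and pitch estimation implies a final step that merges the two outputs into note events. The abstract does not describe that step, so the following is only a minimal sketch under stated assumptions: the two network outputs are replaced by precomputed inputs, and the function name `frames_to_notes`, the parameter `hop_seconds`, and the median-based pitch rule are hypothetical choices made for illustration, not the authors' implementation.

```python
from statistics import median


def frames_to_notes(onset_frames, pitch_per_frame, hop_seconds=0.01):
    """Merge onsets and frame-wise pitch into (start_s, end_s, midi_pitch) notes.

    onset_frames:    sorted frame indices where a note starts
                     (stand-in for the onset detection output).
    pitch_per_frame: MIDI pitch per frame, None for unvoiced frames
                     (stand-in for the pitch estimation output).
    hop_seconds:     assumed time between consecutive frames.
    """
    notes = []
    # Each note spans from its onset to the next onset (or the end of the signal).
    boundaries = list(onset_frames) + [len(pitch_per_frame)]
    for start, stop in zip(boundaries[:-1], boundaries[1:]):
        segment = pitch_per_frame[start:stop]
        voiced = [p for p in segment if p is not None]
        if not voiced:
            continue  # onset without any voiced frame: no note is emitted
        # The note ends after the last voiced frame of its segment.
        last_voiced = max(i for i, p in enumerate(segment) if p is not None)
        end = start + last_voiced + 1
        # Assumption: one pitch per note, the rounded median of the voiced frames.
        notes.append((start * hop_seconds, end * hop_seconds, round(median(voiced))))
    return notes


if __name__ == "__main__":
    onsets = [0, 4]
    pitches = [60.1, 59.9, 60.0, None, 62.2, 61.8, 62.0, 62.1]
    for note in frames_to_notes(onsets, pitches):
        print(note)
```

Because each note depends only on the frames between two consecutive onsets, a merge of this kind can emit a note as soon as the next onset arrives, which is one plausible reading of why the approach adapts to the real-time case.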
Author(s): Schwabe, Markus; Murgul, Sebastian; Heizmann, Michael
Affiliation:
Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology, Karlsruhe, Germany
(See document for exact affiliation information.)
Publication Date:
2022-12-06
Permalink: https://aes2.org/publications/elibrary-page/?id=22025
Schwabe, Markus; Murgul, Sebastian; Heizmann, Michael; 2022; Dual Task Monophonic Singing Transcription [PDF]; Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology, Karlsruhe, Germany; Paper; Available from: https://aes2.org/publications/elibrary-page/?id=22025
Schwabe, Markus; Murgul, Sebastian; Heizmann, Michael; Dual Task Monophonic Singing Transcription [PDF]; Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology, Karlsruhe, Germany; Paper; 2022. Available: https://aes2.org/publications/elibrary-page/?id=22025