With the omnipresence of digital multimedia data, the processing, analysis, and understanding of such data by means of automated methods has become a central issue in engineering and computer science.
Semantic audio is concerned with:
- analysing audio signals in order to infer semantically meaningful information that can be understood by humans
- decomposing audio signals into semantic entities so that these audio objects can be handled, modified and interacted with in an intuitive way
- enabling a machine to process audio signals as human experts could do (at least for the simple and boring tasks)
Such methods are relevant for the following applications:
- analysing music for automated recommendation services
- automatic transcription, score following and source separation for personalised sound and interactive music education
- managing large amounts of data in audio editing and production
- new consumer applications including DJ, karaoke, and dialog enhancement software
Deep learning is also omnipresent. It is a branch of machine learning that in recent years has produced methods that outperform their predecessors by large margins. This happened in computer vision and natural language processing, and then also in digital speech and audio signal processing, e.g. in speech recognition, speech synthesis, speech enhancement, dereverberation and blind source separation.