AES E-Library

Crowdsourcing Audio Semantics by Means of Hybrid Bimodal Segmentation with Hierarchical Classification

The task of general audio detection and segmentation is quite common in contemporary audio applications where computationally intensive processes are frequently involved. Machine learning is usually employed along with user-enabled data labeling that is intended to detect, segment, and semantically annotate the relevant audio events. This work focuses on a generic audio detection and classification method that combines hierarchical bimodal segmentation with hybrid pattern classification at different temporal resolutions. This paper presents the algorithmic perspective of a mobile back-end system to facilitate the construction, validation, and continuous update of generic audio ground-truth data. The goal is the implementation of a system that is capable of performing well in different conditions without relying on complicated pattern recognition systems and taxonomies. For this reason, minimal prior knowledge is necessary so that there is consistent behavior for different input signals and computational environments. Novel “classification confidence” metrics are implemented.

Author(s): Vrysis, Lazaros; Tsipas, Nikolaos; Dimoulas, Charalampos; Papanikolaou, George
Affiliation: Aristotle University of Thessaloniki, Thessaloniki, Greece (See document for exact affiliation information.)
Publication Date: 2016-12-06
Permalink: https://aes2.org/publications/elibrary-page/?id=18537

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.

Type: Journal Article
E-Library location: (CD JAES64) TMP/JAES64/12/