AES E-Library

Efficient data collection pipeline for audio machine learning of audio quality

In this paper we study perceptual evaluation data collection for the purposes of machine learning. Well-established listening test methods have been developed and standardised in the audio community over many years. This paper looks at the specific needs of machine learning and seeks to establish efficient data collection methods that address those needs whilst also providing robust and repeatable perceptual evaluation results. Following a short review of efficient data collection techniques, including the concept of data augmentation, we introduce the new concept of pre-augmentation as an alternative efficient data collection approach. Multiple stimulus presentation style listening tests are then presented for the evaluation of a wide range of audio quality devices (headphones), assessed by a panel of trained expert assessors. Two tests are presented, one using a traditional full factorial design and one using a pre-augmented design, to enable a performance comparison of the two approaches. The two approaches are statistically analysed and discussed. Finally, the performance of the two approaches for building machine learning models is reviewed by comparing a range of baseline models.

 

Permalink: https://aes2.org/publications/elibrary-page/?id=21081


