AES E-Library

Categorization of Broadcast Audio Objects in Complex Auditory Scenes

Because object-based audio is becoming an important framework for the representation of complex sound scenes, this research describes a series of experiments to determine a categorization framework for broadcast audio objects. Categorization is a fundamental human strategy for reducing cognitive load, and knowledge of these categories should be beneficial for the development of perceptually based representations and rendering strategies for object-based audio. In this study, 21 expert and non-expert listeners took part in a free card sorting task using audio objects from a variety of different types of program material. Hierarchical agglomerative clustering suggests that there are 7 general categories, which relate to sounds indicating actions and movement, continuous background sound, transient background sound, clear speech, non-diegetic music and effects, sounds indicating the presence of people, and prominent attention-grabbing transient sounds. A three-dimensional perceptual space calculated via multidimensional scaling suggests that these categories vary along the dimensions of semantic content, continuous-transient, and presence-absence of people. The position of an audio object along the dimensions of the perceptual space relates to its perceived importance.

 

Author (s):
Affiliation: (See document for exact affiliation information.)
Publication Date:
Permalink: https://aes2.org/publications/elibrary-page/?id=18297


(610KB)


Download Now

Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member Join the AES. If you need to check your member status, login to the Member Portal.

Type:
E-Libary location:
16938
Choose your country of residence from this list:










Skip to content