AES E-Library

Semantic Audio: Machines Get Clever with Music

[Feature] The growth in computer power over the past decade has enabled remarkable possibilities for the automatic interpretation of audio signals. As human listeners, we make all sorts of conscious and unconscious interpretations of what we hear, from recognizing instruments and voices within a complex texture, through extracting melodic and chordal progressions, to inferring emotional mood or cultural associations. All of this is based on listening to a single mixed stream of sound that is just a messy waveform. If we are lucky, there may be some spatial information from receiving more than one related stream from different directions, but at best we have only two ears, no matter how many sources there are. Enabling machines to make sense of mixed audio streams was close to the realm of science fiction not so long ago, but the latest research in semantic audio analysis brings it within our grasp.

Permalink: https://aes2.org/publications/elibrary-page/?id=16154