AES E-Library

Audio Pattern Recognition of Baby Crying Sound Events

Infants can communicate their internal state (such as pain, hunger, fear, fatigue, or stress) by the nature of their crying. Experts in linguistics suggest that the cry constitutes the first manifestation of speech. This article describes the design methodology for classifying baby crying sound events according to the pathological status of the infant. Such an automated system can aid an attending physician in performing a diagnosis. To address this challenge, a wide variety of audio parameters (Perceptual Linear Prediction, Mel Frequency Cepstral Coefficients, Perceptual Wavelet Packets, Teager Energy Operator, Temporal Modulation) were considered. Classification techniques, including Multilayer Perceptron, Support Vector Machine, Random Forest, Reservoir Network, Gaussian Mixture Model, and Hidden Markov Model, were customized. The goal is to provide an automatic and noninvasive framework for monitoring infants and helping inexperienced or trainee pediatricians, parents, and baby caregivers to identify the baby’s pathological status.
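As an illustration of one of the audio parameters named above, the following is a minimal sketch of the discrete Teager Energy Operator in its standard form, psi[n] = x[n]^2 - x[n-1]*x[n+1], applied to a single frame of samples. It is not the authors' implementation: the abstract does not say how the operator was configured or combined with the other features. The function names, the 8 kHz sample rate, the 450 Hz test tone, and the 256-sample frame length are all assumptions chosen for the example.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    do not have a neighbor on both sides.
    """
    if len(x) < 3:
        return []
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_teager_energy(frame):
    """Average Teager energy over one frame (a hypothetical per-frame feature)."""
    psi = teager_energy(frame)
    return sum(psi) / len(psi) if psi else 0.0


# Hypothetical test signal: a pure tone, NOT real cry data.
sample_rate = 8000          # Hz, assumed
frequency = 450.0           # Hz, assumed
amplitude = 0.5
omega = 2 * math.pi * frequency / sample_rate   # radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(256)]

feature = mean_teager_energy(tone)
# For a pure sinusoid the operator is constant: amplitude^2 * sin(omega)^2.
expected = (amplitude ** 2) * (math.sin(omega) ** 2)
print(feature, expected)
```

For a pure sinusoid the operator output depends on both amplitude and frequency, which is why it is often used as an energy measure that also tracks changes in pitch. In a pipeline of the kind the abstract describes, a per-frame value like this would be one entry in a feature vector passed to a classifier.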

 

Permalink: https://aes2.org/publications/elibrary-page/?id=17641



E-Library location: 16938