This study examines listeners' natural ability to identify an anonymous speaker's emotions from speech samples spanning a broad range of emotional intensity. It compares emotional ratings between posed and spontaneous speech samples and analyzes how basic acoustic parameters are used. The spontaneous samples were extracted from the Korean Spontaneous Speech corpus, which consists of casual conversations. The posed samples, covering four emotions (happiness, neutrality, anger, sadness), were obtained from the Emotion Classification dataset. Non-native listeners evaluated seven opposing pairs of affective attributes perceived in the speech samples. Listeners perceived fewer spontaneous samples as having negative valence. The posed samples received higher mean rating scores than the spontaneous ones only for negative valence. Listeners reacted more sensitively to posed than to spontaneous speech in negative valence and had difficulty detecting happiness in the posed samples. Spontaneous samples perceived as positive showed higher pitch variance and higher maximum pitch than those perceived as negative. In contrast, posed samples perceived as negative were positively correlated with higher values of the pitch parameters. These results can be used to assign specific vocal affects to artificial-intelligence voice agents or virtual humans, rendering their voices more human-like.
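As an illustrative sketch only (not the authors' pipeline), the two pitch parameters discussed above, pitch variance and maximum pitch, could be estimated from a speech sample as follows, assuming the librosa library and a hypothetical local WAV file:

```python
# Illustrative sketch: estimate F0 with pYIN and summarize the pitch
# statistics mentioned in the abstract (pitch variance, maximum pitch).
# File name and parameter bounds are assumptions, not from the paper.
import numpy as np
import librosa

def pitch_statistics(path: str) -> dict:
    """Return pitch variance and maximum pitch over voiced frames."""
    y, sr = librosa.load(path, sr=None)          # keep original sample rate
    f0, voiced_flag, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),           # ~65 Hz lower bound
        fmax=librosa.note_to_hz("C6"),           # ~1047 Hz upper bound
        sr=sr,
    )
    f0_voiced = f0[voiced_flag & ~np.isnan(f0)]  # voiced frames only
    return {
        "pitch_variance_hz2": float(np.var(f0_voiced)),
        "max_pitch_hz": float(np.max(f0_voiced)),
    }

# Example usage (hypothetical file): stats = pitch_statistics("sample.wav")
```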
Authors: Oh, Eunmi; Suhr, Jinsun
Affiliation: Yonsei University (both authors)
AES Convention: 155
Paper Number: 10671
Publication Date: 2023-10-06
Session Subject: Signal Processing
Permalink: https://aes2.org/publications/elibrary-page/?id=22252
Oh, Eunmi; Suhr, Jinsun; 2023; Vocal Affects Perceived from Spontaneous and Posed Speech [PDF]; Yonsei University; AES Convention 155, Paper 10671; Available from: https://aes2.org/publications/elibrary-page/?id=22252