AES E-Library

Audiovisual Congruence and Localization Performance in Virtual Reality: 3D Loudspeaker Model vs. Human Avatar

This paper investigates audiovisual congruence in virtual reality with both horizontal and vertical offsets between audio and visual rendering. Audiovisual congruence and localization errors are assessed using loudspeaker playback and nonindividualized headphone rendering. To account for the influence of different types of visual information on congruence, presentations of a loudspeaker model and a 3D human avatar are compared; for this purpose, a new dataset of audiovisual speech was recorded. Results show that human avatar rendering increases perceived congruence, and that experienced listeners have an increased tendency to respond with “incongruent” when a loudspeaker model is shown but not when the human avatar is presented. Moreover, a correlation is found between localization precision and audiovisual congruence for horizontally offset stimuli with avatar presentation. For vertical offsets, the angular range of congruence is generally large and localization errors are high, so no correlation can be observed between the two. The paper contributes congruence ranges for audiovisual speech in virtual reality, which also have implications for augmented reality telepresence use.

 

Permalink: https://aes2.org/publications/elibrary-page/?id=22771


E-Library location: 16938