Voice activity detection (VAD) is a critical part of some speech processing systems because the processing algorithm must distinguish real speech from unrelated background sounds. This report explores the combination of a neural network and dual microphones to improve VAD estimates in handset applications. Two new features are extracted from the dual-microphone signals: subband signed power difference (SBSPD) and inter-microphone cross correlation (IMCC). SBSPD provides accurate power difference information in individual frequency bands, and IMCC contains detailed spatial location information derived from both microphones. Extensive objective evaluation was performed under various noise conditions, including directional speech interference. Compared with existing methods based on the power level difference ratio, the proposed method gives more accurate and more robust VAD estimates across noise environments, especially with directional speech interference. Because the method adapts to the sonic environment, parameter optimization is not needed, and the approach is suitable for hand-held devices.
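The abstract names the two features but does not give their formulas, so the following is only a rough sketch of what per-frame SBSPD-like and IMCC-like features could look like. Everything here is an assumption made for illustration: the function names, the uniform band split, the normalization to [-1, 1], the lag range, and the naive DFT are not taken from the paper. In the described system, feature vectors of this kind would be fed to a neural network that outputs the VAD estimate.

```python
import cmath
import math


def dft_power(frame):
    """Power spectrum |X[k]|^2 for k = 0..N/2, via a naive DFT (fine for short frames)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        powers.append(abs(acc) ** 2)
    return powers


def subband_signed_power_difference(primary, secondary, num_bands=4, eps=1e-12):
    """Hypothetical SBSPD: normalized, signed power difference per frequency band.

    Positive values mean the primary microphone has more power in that band,
    negative values mean the secondary microphone does. The uniform band split
    and the normalization are assumptions, not the paper's definition.
    """
    p1 = dft_power(primary)
    p2 = dft_power(secondary)
    bins = len(p1)
    edges = [round(b * bins / num_bands) for b in range(num_bands + 1)]
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        e1 = sum(p1[lo:hi])
        e2 = sum(p2[lo:hi])
        feats.append((e1 - e2) / (e1 + e2 + eps))
    return feats


def inter_mic_cross_correlation(primary, secondary, max_lag=4, eps=1e-12):
    """Hypothetical IMCC: normalized cross-correlation at lags -max_lag..max_lag.

    The position of the peak reflects the inter-microphone delay, which depends
    on where the source is relative to the two microphones.
    """
    n = len(primary)
    norm = math.sqrt(sum(x * x for x in primary) * sum(y * y for y in secondary)) + eps
    feats = []
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += primary[i] * secondary[j]
        feats.append(acc / norm)
    return feats


if __name__ == "__main__":
    # Toy frame: the secondary microphone sees the same tone at half amplitude.
    frame_len = 64
    primary = [math.sin(2 * math.pi * 4 * i / frame_len) for i in range(frame_len)]
    secondary = [0.5 * x for x in primary]
    features = (subband_signed_power_difference(primary, secondary)
                + inter_mic_cross_correlation(primary, secondary))
    print(len(features), features)
```

Keeping the sign of the power difference, rather than only a ratio or magnitude, lets a classifier tell which microphone dominates in each band. That is one plausible reading of why a signed subband feature would help against directional interference, but the paper itself should be consulted for the actual definitions.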
Author(s): Zhang, LuoFei; Zhang, Ming; Li, Chen
Affiliation:
Jiangsu Audio Engineering Lab, School of Physics and Technology, Nanjing Normal University, Nanjing, China
(See document for exact affiliation information.)
Publication Date:
2015-12-06
Permalink: https://aes2.org/publications/elibrary-page/?id=18059
Zhang, LuoFei; Zhang, Ming; Li, Chen; 2015; Dual-Microphone Voice Activity Detection Estimate in Handset Applications Based on Neural Network by Using Subband Signed Power Difference and Inter-Microphone Cross Correlation [PDF]; Jiangsu Audio Engineering Lab, School of Physics and Technology, Nanjing Normal University, Nanjing, China; Paper; Available from: https://aes2.org/publications/elibrary-page/?id=18059
@article{zhang2015dual-microphone,
author={Zhang, LuoFei and Zhang, Ming and Li, Chen},
journal={Journal of the Audio Engineering Society},
title={Dual-Microphone Voice Activity Detection Estimate in Handset Applications Based on Neural Network by Using Subband Signed Power Difference and Inter-Microphone Cross Correlation},
year={2015},
volume={63},
number={12},
pages={1017--1024},
month={december},
}
TY  - paper
TI  - Dual-Microphone Voice Activity Detection Estimate in Handset Applications Based on Neural Network by Using Subband Signed Power Difference and Inter-Microphone Cross Correlation
SP  - 1017
EP  - 1024
AU  - Zhang, LuoFei
AU  - Zhang, Ming
AU  - Li, Chen
PY  - 2015
JO  - Journal of the Audio Engineering Society
VO  - 63
IS  - 12
Y1  - December 2015