AES E-Library

Hybrid Approach to Speech Source Separation Depending on the Voicing State

Single-channel speech source separation (SCSSS) is a research field with applications that include hearing aids and security. This research presents a hybrid method for SCSSS that selects between two approaches according to the voicing state; the algorithm can be used for both speech source separation and speech enhancement. The method uses subspace decomposition for unvoiced speech and Soft-CASA (Computational Auditory Scene Analysis) for voiced speech. The voiced-speech separation process improves on the conventional CASA system by using a soft mask. The unvoiced-speech separation process relies on an optimized approximation of the speech signal by subspace decomposition in the spectral domain. The new system is evaluated on both its speech separation outcome and its voicing decision. Despite the challenging acoustic environments used for testing, the proposed speech separation approach yields on average a 58.91% improvement in signal-to-interference ratio, a 12.67% improvement in signal-to-artifact ratio, a 38.91% improvement in signal-to-distortion ratio, and a 45% improvement in perceived speech quality.
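The routing idea in the abstract (one separation path for voiced frames, another for unvoiced frames) can be illustrated with a rough sketch. This is not the paper's implementation: the ratio-style soft mask, the truncated-SVD stand-in for subspace decomposition, and every name below (`soft_mask`, `low_rank_approx`, `hybrid_separate`, `rank`) are assumptions chosen only for illustration, and the target and interferer magnitude estimates are taken as given.

```python
import numpy as np

def soft_mask(target_mag, interferer_mag, eps=1e-12):
    """Ratio mask with values in [0, 1] (an assumed form, not the paper's mask)."""
    t2 = np.square(target_mag)
    i2 = np.square(interferer_mag)
    return t2 / (t2 + i2 + eps)

def low_rank_approx(spec_mag, rank):
    """Truncated-SVD approximation, a hypothetical stand-in for subspace decomposition."""
    u, s, vt = np.linalg.svd(spec_mag, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def hybrid_separate(mix_mag, target_est, interferer_est, voiced, rank=2):
    """Route each frame by voicing state.

    mix_mag, target_est, interferer_est: (freq_bins, frames) magnitude arrays.
    voiced: one boolean per frame (the voicing decision is assumed given).
    """
    voiced = np.asarray(voiced, dtype=bool)
    out = np.zeros_like(mix_mag, dtype=float)
    # Voiced frames: weight the mixture by the soft mask.
    mask = soft_mask(target_est[:, voiced], interferer_est[:, voiced])
    out[:, voiced] = mask * mix_mag[:, voiced]
    # Unvoiced frames: keep only the dominant spectral subspace.
    if np.any(~voiced):
        unvoiced_part = mix_mag[:, ~voiced]
        k = min(rank, *unvoiced_part.shape)
        out[:, ~voiced] = low_rank_approx(unvoiced_part, k)
    return out
```

Unlike a binary mask, which assigns each time-frequency unit entirely to one source, the soft mask here gives a unit with equal target and interferer energy a weight of about 0.5.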

 

Permalink: https://aes2.org/publications/elibrary-page/?id=19877



E-Library location: 16938