In this paper, we propose a method for extending monaural sound to stereophonic sound based on deep neural networks (DNNs). First, the monaural signal is assumed to be the mid signal of the extended stereo signal. Residual signals are then obtained by linear prediction (LP) analysis, and the LP coefficients of the monaural signal are converted into line spectral frequency (LSF) coefficients. The LSF coefficients serve as the DNN features, and the features of the side signal are estimated from those of the mid signal. Performance is evaluated using a log spectral distortion (LSD) measure and a multiple stimuli with hidden reference and anchor (MUSHRA) test. The comparison shows that the proposed method yields a lower LSD and a higher MUSHRA score than a conventional method based on a hidden Markov model (HMM).
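The signal-processing steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The following are assumptions, since the abstract does not specify them: the mid/side scaling (mid = (L + R) / 2, side = (L - R) / 2), the autocorrelation method for LP analysis, an even LP order, a per-frame RMS definition of LSD in dB, and the frame and hop sizes. All function names are hypothetical, and the DNN that maps mid-signal LSFs to side-signal LSFs is not shown.

```python
import numpy as np

def ms_encode(left, right):
    """Assumed convention: mid = (L + R) / 2, side = (L - R) / 2."""
    return 0.5 * (left + right), 0.5 * (left - right)

def ms_decode(mid, side):
    """Inverse of ms_encode: L = mid + side, R = mid - side."""
    return mid + side, mid - side

def lpc(frame, order):
    """Autocorrelation-method LP analysis; returns A(z) as [1, a_1, ..., a_p]."""
    full = np.correlate(frame, frame, mode="full")
    r = full[len(frame) - 1 : len(frame) + order]  # autocorrelation lags 0..order
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)]
                         for i in range(order)])
    predictor = np.linalg.solve(toeplitz, r[1 : order + 1])
    return np.concatenate(([1.0], -predictor))

def lp_residual(frame, a):
    """Residual obtained by filtering the frame with the FIR filter A(z)."""
    return np.convolve(frame, a)[: len(frame)]

def lpc_to_lsf(a):
    """Convert A(z) of even order to sorted LSFs in radians, inside (0, pi)."""
    order = len(a) - 1
    if order % 2:
        raise ValueError("this sketch handles even LP orders only")
    ext = np.concatenate((a, [0.0]))
    p_poly = ext + ext[::-1]  # symmetric polynomial P(z)
    q_poly = ext - ext[::-1]  # antisymmetric polynomial Q(z)
    p_poly = np.polydiv(p_poly, [1.0, 1.0])[0]   # remove trivial root z = -1
    q_poly = np.polydiv(q_poly, [1.0, -1.0])[0]  # remove trivial root z = +1
    roots = np.concatenate((np.roots(p_poly), np.roots(q_poly)))
    angles = np.angle(roots)
    return np.sort(angles[angles > 0])  # one angle per conjugate pair

def log_spectral_distortion(ref, est, frame_len=512, hop=256, eps=1e-10):
    """Mean over frames of the RMS difference between log power spectra (dB)."""
    n = min(len(ref), len(est))
    if n < frame_len:
        raise ValueError("signals are shorter than one frame")
    window = np.hanning(frame_len)
    dists = []
    for start in range(0, n - frame_len + 1, hop):
        r = np.fft.rfft(window * ref[start : start + frame_len])
        e = np.fft.rfft(window * est[start : start + frame_len])
        log_r = 10.0 * np.log10(np.abs(r) ** 2 + eps)
        log_e = 10.0 * np.log10(np.abs(e) ** 2 + eps)
        dists.append(np.sqrt(np.mean((log_r - log_e) ** 2)))
    return float(np.mean(dists))
```

Under these assumptions, an estimated side signal would be combined with the monaural (mid) signal through `ms_decode` to produce the left and right channels, and `log_spectral_distortion` would compare each estimated channel against its reference.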
Author(s): Chun, Chan Jun; Jeong, Seok Hee; Park, Su Yeon; Kim, Hong Kook
Affiliation: Gwangju Institute of Science and Technology (GIST), Gwangju, Korea; City University of New York, New York, NY, USA
(See document for exact affiliation information.)
AES Convention: 139
Paper Number: 9400
Publication Date: 2015-10-06
Session subject: Signal Processing
Permalink: https://aes2.org/publications/elibrary-page/?id=17957
Chun, Chan Jun; Jeong, Seok Hee; Park, Su Yeon; Kim, Hong Kook; 2015; Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks [PDF]; Gwangju Institute of Science and Technology (GIST), Gwangju, Korea; City University of New York, New York, NY, USA; Paper 9400; Available from: https://aes2.org/publications/elibrary-page/?id=17957
@article{chun2015extension,
author={Chun, Chan Jun and Jeong, Seok Hee and Park, Su Yeon and Kim, Hong Kook},
journal={Journal of the Audio Engineering Society},
title={Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks},
year={2015},
number={9400},
month={october},
}