Exploiting inter-frame redundancies is key to enhancing the performance of delay-constrained perceptual audio coders. The long term prediction (LTP) tool was introduced in the MPEG Advanced Audio Coding standard, especially for the low delay mode, to capitalize on the periodicity of naturally occurring sounds by identifying a segment of previously reconstructed data as the prediction for the current frame. However, speech and vocal content in audio signals is well known to be quasi-periodic and to involve small variations in pitch period, which compromise the performance of the LTP tool. The proposed approach modifies LTP by introducing a single parameter of “geometric” warping, whereby past periodicity is geometrically warped to provide an adjusted prediction for the current samples. We also propose a three-stage parameter estimation technique: first, an unwarped LTP filter is estimated to minimize the mean squared prediction error; then, the filter parameters are complemented with the warping parameter and re-estimated within a small neighboring search space to retain the set of S best LTP parameters; finally, a perceptual distortion-rate procedure selects, from the S candidates, the parameter set that minimizes the perceptual distortion. Objective and subjective evaluations substantiate the proposed technique’s effectiveness.
Author(s): Nanjundaswamy, Tejaswi; Rose, Kenneth
Affiliation: University of California, Santa Barbara, Santa Barbara, CA, USA
(See document for exact affiliation information.)
AES Convention: 133
Paper Number: 8767
Publication Date: 2012-10-06
Session subject: Sound Analysis and Synthesis
Permalink: https://aes2.org/publications/elibrary-page/?id=16509
Nanjundaswamy, Tejaswi; Rose, Kenneth; 2012; On Accommodating Pitch Variation in Long Term Prediction of Speech and Vocals in Audio Coding; University of California, Santa Barbara, Santa Barbara, CA, USA; Paper 8767; Available from: https://aes2.org/publications/elibrary-page/?id=16509