ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Technical Program

Paper Detail

Paper: SLP-P13.4
Session: Speech Synthesis III
Time: Thursday, May 18, 10:00 - 12:00
Presentation: Poster
Topic: Speech and Spoken Language Processing: Evaluation metrics
Title: Perceptual Distortion Analysis and Quality Estimation of Prosody-Modified Speech for TD-PSOLA
Authors: Shi-Han Chen, Shun-Ju Chen, Chih-Chung Kuo, Industrial Technology Research Institute, Taiwan
Abstract: TD-PSOLA is one of the most widely used prosodic modification techniques. However, it occasionally introduces perceptible distortions, and how TD-PSOLA affects speech quality has not been fully understood or controlled. In this paper, we present a method that estimates quality before the modification is performed. By exploiting the relationship between prosodic modifications and subjective scores, we propose 27 distance measures and report and compare their respective performances. An extensive search over every possible combination of these measures shows that the best correlation between the predicted and subjective scores is 87.6%, obtained by linear regression on 4 of the proposed distance measures. The proposed method does not require synthesizing the target and can be used in both online unit selection and off-line corpus design for TTS systems.
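
The selection procedure summarized in the abstract (search over combinations of distance measures, fit a linear regression on each combination, keep the one whose predictions correlate best with the subjective scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_and_correlate` and `best_subset`, the synthetic feature matrix `X`, the synthetic scores `y`, and the choice of 8 features with a subset size cap of 3 are all assumptions made for illustration. The 27 measures and the listening-test data from the paper are not reproduced here.

```python
from itertools import combinations

import numpy as np


def fit_and_correlate(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    Returns the Pearson correlation between the fitted predictions and y.
    """
    A = np.column_stack([X, np.ones(len(y))])  # append intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    return float(np.corrcoef(pred, y)[0, 1])


def best_subset(X, y, max_size):
    """Try every feature subset of size 1..max_size; keep the best correlation."""
    best_r, best_cols = -1.0, ()
    for k in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), k):
            r = fit_and_correlate(X[:, list(cols)], y)
            if r > best_r:
                best_r, best_cols = r, cols
    return best_cols, best_r


# Synthetic stand-in data (assumption): 200 stimuli, 8 candidate measures,
# and a score that depends on measures 1, 4 and 6 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 2.0 * X[:, 1] - 1.0 * X[:, 4] + 0.5 * X[:, 6] + 0.1 * rng.normal(size=200)

cols, r = best_subset(X, y, max_size=3)
print("selected measures:", cols, "correlation:", round(r, 3))
```

In this sketch the correlation is computed on the same data used for the fit, so it can only grow as measures are added; the size cap (or a held-out set) is what keeps the selected combination small. How the paper handled this is not stated in the abstract.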



IEEE Signal Processing Society
