Technical Program

Paper Detail

Paper:	SLP-P13.11
Session:	Speech Synthesis III
Time:	Thursday, May 18, 10:00 - 12:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Tools and data for speech synthesis
Title:	Database Pruning for Unsupervised Building of Text-to-Speech Voices
Authors:	Jordi Adell, Pablo Daniel Agüero, Antonio Bonafonte, Universitat Politècnica de Catalunya (UPC), Spain
Abstract:	Unit selection techniques represent the state of the art in speech synthesis. Building a new voice requires automatic segmentation of the speech database. Databases may contain errors, and the segmentation process may introduce more. High-quality systems require significant effort to find and correct these errors. Phonetic transcription is crucial and is one of the manually supervised tasks. Automatically removing incorrectly transcribed units from the inventory would make the process more automatic. Here we present a new technique, based on speech recognition confidence measures, that removes 90% of the incorrectly transcribed units from a database at the cost of losing only 10% of the correctly transcribed units.
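
The trade-off stated in the abstract (incorrect units removed versus correct units lost) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how the confidence measure is computed or how units are selected for removal. The sketch assumes each unit carries a single hypothetical confidence score, that pruning is a simple threshold on that score, and that a hand-checked evaluation set provides a correctness flag. The names Unit, prune and pruning_rates are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    label: str          # phonetic label of the unit
    confidence: float   # hypothetical recognizer confidence (higher = more trusted)
    correct: bool       # ground truth, assumed known only for an evaluation set


def prune(units, threshold):
    """Keep only the units whose confidence is at or above the threshold."""
    return [u for u in units if u.confidence >= threshold]


def pruning_rates(units, threshold):
    """Return (fraction of incorrect units removed, fraction of correct units lost)."""
    removed = [u for u in units if u.confidence < threshold]
    n_incorrect = sum(not u.correct for u in units)
    n_correct = sum(u.correct for u in units)
    incorrect_removed = sum(not u.correct for u in removed)
    correct_lost = sum(u.correct for u in removed)
    return (
        incorrect_removed / n_incorrect if n_incorrect else 0.0,
        correct_lost / n_correct if n_correct else 0.0,
    )
```

Sweeping the threshold over an evaluation set and reading off both rates gives the kind of operating point the abstract reports, where most incorrect units are removed for a small loss of correct ones.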