ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Technical Program

Paper Detail

Paper: SLP-P10.3
Session: Speech Synthesis II
Time: Wednesday, May 17, 14:00 - 16:00
Presentation: Poster
Topic: Speech and Spoken Language Processing: Text-to-phoneme conversion
Title: IDENTIFYING LANGUAGE ORIGIN OF PERSON NAMES WITH N-GRAMS OF DIFFERENT UNITS
Authors: Yining Chen, Microsoft Research Asia, China; Jiali You, Chinese Academy of Sciences, China; Min Chu, Yong Zhao, Microsoft Research Asia, China; Jinlin Wang, Chinese Academy of Sciences, China
Abstract: Identifying the language origin of a name that appears in English text is important for generating the correct pronunciation of the name. In this paper, N-grams of syllable-based letter clusters are proposed for the task. The performance of an N-gram model over a set of frequently used letter clusters (which correspond to syllables) is compared with that of a letter N-gram model in a four-language task (English, German, French and Portuguese). On average, the letter-cluster N-gram, with a 26% error rate, is slightly better than the letter N-gram, with a 27.2% error rate. Furthermore, the error distributions of the two N-grams are found to differ considerably. Therefore, AdaBoost is used to combine the results from N-grams of different units. After the combination, the error rate is reduced to 22.5%, a relative error reduction of 17.5%.
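
Illustration: the letter N-gram baseline mentioned in the abstract can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. It assumes letter trigrams, add-one smoothing, a fixed symbol inventory of 27 (26 letters plus an end marker), and a tiny hypothetical training list (TOY_NAMES). The names LetterNgramModel and identify_origin are made up for this example, and the letter-cluster units and the AdaBoost combination are not shown.

```python
import math
from collections import Counter

VOCAB_SIZE = 27  # assumption: 26 lowercase letters plus the end marker "$"

def ngrams(name, n=3):
    """Return the letter n-grams of a name, padded with boundary markers."""
    padded = "^" * (n - 1) + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

class LetterNgramModel:
    """Letter n-gram model for one language, with add-one smoothing."""

    def __init__(self, names, n=3):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        for name in names:
            for gram in ngrams(name, n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, name):
        """Smoothed log-probability of the name under this language model."""
        total = 0.0
        for gram in ngrams(name, self.n):
            numerator = self.ngram_counts[gram] + 1
            denominator = self.context_counts[gram[:-1]] + VOCAB_SIZE
            total += math.log(numerator / denominator)
        return total

def identify_origin(name, models):
    """Pick the language whose model assigns the name the highest score."""
    return max(models, key=lambda lang: models[lang].log_prob(name))

# Hypothetical toy data, for illustration only (not the paper's corpus).
TOY_NAMES = {
    "English": ["smith", "johnson", "williams", "brown", "taylor"],
    "German": ["schmidt", "schneider", "fischer", "schwarz", "hoffmann"],
    "French": ["dubois", "moreau", "lefebvre", "rousseau", "girard"],
    "Portuguese": ["silva", "santos", "oliveira", "pereira", "carvalho"],
}

models = {lang: LetterNgramModel(names) for lang, names in TOY_NAMES.items()}
print(identify_origin("schmidt", models))
```

A letter-cluster variant would replace ngrams() with a function that first segments the name into syllable-like clusters and then forms N-grams over those units; how the clusters are chosen is described in the paper, not here.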



IEEE Signal Processing Society
