Paper: | SLP-P1.7 |
Session: | Feature Extraction and Modeling |
Time: | Tuesday, May 16, 10:30 - 12:30 |
Presentation: | Poster |
Topic: |
Speech and Spoken Language Processing: Pronunciation Modeling |
Title: |
Automatic Derivation of A Phoneme Set With Tone Information for Chinese Speech Recognition Based on Mutual Information Criterion |
Authors: |
Jin-Song Zhang, Xin-Hui Hu, Satoshi Nakamura, ATR Spoken Language Communication Research Laboratories, Japan |
Abstract: |
An appropriate approach to modeling tone information is helpful for Chinese speech recognition systems. We propose to derive an efficient phoneme set with tone dependencies by iteratively merging pairs of originally tone-dependent units according to the principle of minimal loss of mutual information (MI), measured between the words and their phoneme transcriptions in a training text corpus using the system's lexical and language models. The approach can keep discriminative tonal (and phoneme) contrasts while merging unimportant ones. The result enables flexible selection of a phoneme set according to a balance between the MI and the number of phonemes. We applied the method to the traditional Initial/Final set and derived several different phoneme sets. Speech recognition experiments using the derived sets showed their effectiveness. |
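
The merging procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several simplifying assumptions: unigram word probabilities stand in for the language model, each word has a single pronunciation (so the MI between words and transcriptions reduces to the entropy of the transcriptions), and only units that share a base phoneme and differ in tone may be merged. The names `base`, `mutual_information`, `greedy_merge`, and the toy lexicon are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def base(unit):
    """Strip a trailing tone digit, e.g. 'a3' -> 'a' (assumed naming scheme)."""
    return unit.rstrip("0123456789")


def mutual_information(word_probs, lexicon, unit_map):
    """I(W; T) in bits, where T is the transcription of word W after unit mapping.

    Assumption: one pronunciation per word, so T is a function of W and
    I(W; T) equals the entropy H(T).
    """
    trans_probs = defaultdict(float)
    for word, p in word_probs.items():
        key = tuple(unit_map[u] for u in lexicon[word])
        trans_probs[key] += p  # homophones pool their probability mass
    return -sum(p * math.log2(p) for p in trans_probs.values() if p > 0)


def greedy_merge(word_probs, lexicon, target_size):
    """Repeatedly merge the same-base pair of classes with the smallest MI loss."""
    units = sorted({u for pron in lexicon.values() for u in pron})
    unit_map = {u: u for u in units}  # every unit starts in its own class
    history = []
    while len(set(unit_map.values())) > target_size:
        classes = sorted(set(unit_map.values()))
        current = mutual_information(word_probs, lexicon, unit_map)
        best = None
        for a, b in combinations(classes, 2):
            if base(a) != base(b):
                continue  # assumption: only tonal variants are merge candidates
            trial = {u: (a if c == b else c) for u, c in unit_map.items()}
            loss = current - mutual_information(word_probs, lexicon, trial)
            if best is None or loss < best[0]:
                best = (loss, a, b, trial)
        if best is None:
            break  # no candidate pairs left
        loss, a, b, unit_map = best
        history.append((a, b, loss))
    return unit_map, history


# Toy data (hypothetical): four equiprobable words, finals carry a tone digit.
toy_probs = {"ma1": 0.25, "ma3": 0.25, "mi3": 0.25, "ni4": 0.25}
toy_lexicon = {
    "ma1": ("m", "a1"),
    "ma3": ("m", "a3"),
    "mi3": ("m", "i3"),
    "ni4": ("n", "i4"),
}

unit_map, history = greedy_merge(toy_probs, toy_lexicon, target_size=4)
for a, b, loss in history:
    print(f"merged {b} into {a}, MI loss = {loss:.2f} bits")
```

On this toy data the i3/i4 contrast is merged first at no MI loss, because the initials m and n already separate those two words. Merging a1/a3 costs 0.5 bits, since it turns "ma1" and "ma3" into homophones. Recording the loss at each step gives the trade-off between MI and phoneme-set size that the abstract mentions.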