Paper: | SLP-L2.3 |
Session: | Advances in Robust Speech Recognition |
Time: | Tuesday, May 16, 14:40 - 15:00 |
Presentation: | Lecture |
Topic: |
Speech and Spoken Language Processing: Model-based robust Speech Recognition |
Title: |
HIDDEN SEMI-MARKOV MODEL BASED SPEECH RECOGNITION SYSTEM USING WEIGHTED FINITE-STATE TRANSDUCER |
Authors: |
Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Nagoya Institute of Technology, Japan |
Abstract: |
In hidden Markov models (HMMs), state duration probabilities decrease exponentially with time. This can be an inappropriate representation of the temporal structure of speech. One solution to this problem is to integrate state duration probability distributions explicitly into the HMM. The resulting model is known as a hidden semi-Markov model (HSMM). Although a number of attempts have been made to use explicit duration models in speech recognition systems, they are not consistent because various approximations were used in both training and decoding. In the present paper, a fully consistent speech recognition system based on the HSMM framework is proposed. In a speaker-dependent continuous speech recognition experiment, the HSMM-based speech recognition system achieved about 5.9% relative error reduction over the corresponding HMM-based one. |
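
The contrast the abstract draws between implicit HMM durations and explicit HSMM durations can be illustrated with a small sketch. In an HMM, a state with self-transition probability a is occupied for exactly d frames with probability a^(d-1)(1-a), which is largest at d = 1 and decays geometrically. The explicit duration model below is a Gaussian evaluated at integer durations and renormalised. That choice, the function names, and all parameter values are assumptions made for illustration only and are not taken from the paper.

```python
import math


def hmm_duration_pmf(self_loop_prob, max_duration):
    """Implicit HMM state duration: p(d) = a^(d-1) * (1 - a) for d = 1..max_duration."""
    a = self_loop_prob
    return [a ** (d - 1) * (1.0 - a) for d in range(1, max_duration + 1)]


def hsmm_duration_pmf(mean, std, max_duration):
    """Explicit duration model (illustrative assumption): a Gaussian evaluated
    at integer durations 1..max_duration and renormalised to sum to one."""
    weights = [
        math.exp(-0.5 * ((d - mean) / std) ** 2)
        for d in range(1, max_duration + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters: a = 0.8 gives an expected HMM duration of
# 1 / (1 - a) = 5 frames, matched here by an explicit model with mean 5.
hmm = hmm_duration_pmf(self_loop_prob=0.8, max_duration=20)
hsmm = hsmm_duration_pmf(mean=5.0, std=1.5, max_duration=20)

# Durations are 1-based, list indices are 0-based.
hmm_mode = 1 + hmm.index(max(hmm))
hsmm_mode = 1 + hsmm.index(max(hsmm))

print("most likely HMM duration: ", hmm_mode)
print("most likely HSMM duration:", hsmm_mode)
```

With these assumed parameters the two models share roughly the same expected duration, yet the HMM still assigns its highest probability to a single-frame stay, while the explicit model peaks at its mean. This is the mismatch with speech timing that motivates the HSMM.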