Technical Program

Paper Detail

Paper:	SLP-L2.3
Session:	Advances in Robust Speech Recognition
Time:	Tuesday, May 16, 14:40 - 15:00
Presentation:	Lecture
Topic:	Speech and Spoken Language Processing: Model-based robust Speech Recognition
Title:	HIDDEN SEMI-MARKOV MODEL BASED SPEECH RECOGNITION SYSTEM USING WEIGHTED FINITE-STATE TRANSDUCER
Authors:	Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Nagoya Institute of Technology, Japan
Abstract:	In hidden Markov models (HMMs), state duration probabilities decrease exponentially with time, which may be an inappropriate representation of the temporal structure of speech. One solution to this problem is to integrate state duration probability distributions explicitly into the HMM; this form is known as a hidden semi-Markov model (HSMM). Although a number of attempts have been made to use explicit duration models in speech recognition systems, they are not consistent because various approximations were used in both training and decoding. In the present paper, a fully consistent speech recognition system based on the HSMM framework is proposed. In a speaker-dependent continuous speech recognition experiment, the HSMM-based speech recognition system achieved a relative error reduction of about 5.9% over the corresponding HMM-based one.
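
The duration argument in the abstract can be illustrated with a small sketch. In an HMM, a state with self-transition probability a is occupied for exactly d frames with probability a^(d-1) * (1 - a), so the most likely duration is always a single frame. The code below contrasts this with an explicit duration distribution of the kind an HSMM attaches to each state. The discretized Gaussian, the function names, and the parameter values are assumptions chosen for illustration only; they are not taken from the paper.

```python
import math


def hmm_duration_pmf(self_loop_prob, max_duration):
    """Duration pmf implied by an HMM state: p(d) = a^(d-1) * (1 - a)."""
    a = self_loop_prob
    return [a ** (d - 1) * (1.0 - a) for d in range(1, max_duration + 1)]


def explicit_duration_pmf(mean, std, max_duration):
    """Hypothetical explicit duration model (an assumption, not the paper's):
    a Gaussian evaluated at d = 1..max_duration and renormalized."""
    weights = [
        math.exp(-0.5 * ((d - mean) / std) ** 2)
        for d in range(1, max_duration + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values: a = 0.8 gives an expected HMM duration of
# 1 / (1 - a) = 5 frames, matching the mean of the explicit model.
hmm = hmm_duration_pmf(self_loop_prob=0.8, max_duration=20)
hsmm = explicit_duration_pmf(mean=5.0, std=1.5, max_duration=20)

# Most probable duration (in frames) under each model.
hmm_mode = 1 + hmm.index(max(hmm))
hsmm_mode = 1 + hsmm.index(max(hsmm))

if __name__ == "__main__":
    print("d   HMM (geometric)   explicit duration model")
    for d in range(1, 11):
        print(f"{d:<3} {hmm[d - 1]:<17.4f} {hsmm[d - 1]:.4f}")
    print("most likely duration:", hmm_mode, "vs", hsmm_mode)
```

Both models here have roughly the same expected duration, but the geometric distribution puts its highest probability on the shortest possible stay, whereas the explicit model peaks near its mean.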