Technical Program

Paper Detail

Paper:	SLP-P19.5
Session:	Model-based Robust Speech Recognition
Time:	Friday, May 19, 10:00 - 12:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Model-based Robust Speech Recognition
Title:	Model Adaptation for Long Convolutional Distortion by Maximum Likelihood Based State Filtering Approach
Authors:	Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama, University of Tokyo, Japan
Abstract:	In environments with considerably long reverberation times, each frame of speech is affected by energy components from the preceding frames. Therefore, to adapt the parameters of an HMM state, it becomes necessary to consider these frames and compute their contributions to the current state. However, the speech frames preceding an HMM state are not known during adaptation of the models. In this paper, we propose to use the preceding states as units of the preceding speech segments, estimate their contributions to the current state in a maximum likelihood manner, and adapt the models by accounting for these contributions. When clean models were adapted by the proposed method for a speaker-dependent isolated word recognition task, the word accuracy of the system typically increased from 67.6% to 83.2% and from 44.8% to 72.5% for channel-distorted speech simulated by linear convolution of clean speech with impulse responses having reverberation times (T_60) of 310 ms and 780 ms, respectively.
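
The test condition described in the abstract, distorted speech obtained by linearly convolving clean speech with a room impulse response, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the impulse response is synthetic (white noise under an exponential envelope that falls by 60 dB at T_60), the 16 kHz sampling rate and 10 ms frame shift are assumed values, and the names synth_rir, reverberate, and frames_spanned are hypothetical. The paper's own impulse responses and front end may differ.

```python
import numpy as np

def synth_rir(t60_s, fs, rng):
    """Hypothetical synthetic impulse response (not the paper's):
    white noise shaped by an envelope that reaches -60 dB at t60_s."""
    n = int(round(t60_s * fs))
    t = np.arange(n) / fs
    # Amplitude 10^(-3) corresponds to -60 dB, reached at t = t60_s.
    envelope = 10.0 ** (-3.0 * t / t60_s)
    h = rng.standard_normal(n) * envelope
    # Normalize to unit energy so the overall level stays comparable.
    return h / np.sqrt(np.sum(h ** 2))

def reverberate(clean, h):
    """Linear (full) convolution of a clean signal with an impulse response."""
    return np.convolve(clean, h)

def frames_spanned(t60_ms, shift_ms=10):
    """Number of frame shifts covered by the reverberation tail
    (ceiling division; the 10 ms shift is an assumed value)."""
    return -(-t60_ms // shift_ms)

fs = 16000                          # assumed sampling rate in Hz
rng = np.random.default_rng(0)
clean = rng.standard_normal(fs)     # stand-in for 1 s of clean speech

for t60_ms in (310, 780):
    h = synth_rir(t60_ms / 1000.0, fs, rng)
    distorted = reverberate(clean, h)
    print(t60_ms, len(h), len(distorted), frames_spanned(t60_ms))
```

Under the assumed 10 ms frame shift, the two reverberation times cover 31 and 78 frames respectively, which gives a rough sense of how many preceding frames can leak energy into the current one and why adapting each state in isolation is insufficient.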