Technical Program

Paper Detail

Paper:	SLP-P19.3
Session:	Model-based Robust Speech Recognition
Time:	Friday, May 19, 10:00 - 12:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Model-based Robust Speech Recognition
Title:	Unsupervised Online Adaptation of Segmental Switching Linear Gaussian Hidden Markov Models for Robust Speech Recognition
Authors:	Qiang Huo, University of Hong Kong, Hong Kong SAR of China; Donglai Zhu, Institute for Infocomm Research, Singapore; Jian Wu, Microsoft Corporation, United States
Abstract:	In our previous work, a Segmental Switching Linear Gaussian Hidden Markov Model (SSLGHMM) was proposed to model "noisy" speech utterances for robust speech recognition. Both ML (maximum likelihood) and MCE (minimum classification error) training procedures were developed for training the model parameters, and their effectiveness was confirmed by evaluation experiments on the Aurora2 and Aurora3 databases. In this paper, we present an ML approach to unsupervised online adaptation (OLA) of SSLGHMM parameters to achieve further performance improvement. An important implementation issue, namely how to initialize the switching linear Gaussian model parameters, is also studied. Evaluation results on the Finnish Aurora3 database show that, in comparison with a baseline system based on ML-trained SSLGHMMs, unsupervised OLA yields relative word error rate reductions of 4.3%, 9.1%, and 17.8% for the well-matched, medium-mismatched, and high-mismatched conditions, respectively.
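
The abstract reports its gains as relative word error rate (WER) reductions. As a minimal sketch of how such a figure is conventionally computed, assuming the usual definition (baseline WER minus adapted WER, divided by baseline WER), the Python snippet below may help. The function name and the WER values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction, in percent, of an adapted system over a baseline.

    Both arguments are absolute word error rates in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical absolute WERs (percent), for illustration only; not results from the paper.
baseline = 10.0
adapted = 9.0
print(f"{relative_wer_reduction(baseline, adapted):.1f}%")
```

Note that a relative reduction says nothing about the absolute error rates on its own: the same relative figure can correspond to very different absolute improvements depending on the baseline.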