Paper: SLP-L10.1
Session: Speaker Adaptation
Time: Friday, May 19, 10:00 - 10:20
Presentation: Lecture
Topic: Speech and Spoken Language Processing: Speaker adaptation and normalization (e.g., VTLN)
Title: Incremental Adaptation Using Bayesian Inference
Authors: Kai Yu, Mark J. F. Gales, University of Cambridge, United Kingdom
Abstract: Adaptive training is a powerful technique for building systems on non-homogeneous training data. Here, a canonical model, representing "pure" speech variability, and a set of transforms, representing unwanted acoustic variabilities, are both trained. To use the canonical model for recognition, a transform for the test acoustic condition is required. In some situations a robust estimate of the transform parameters may not be possible due to limited, or no, adaptation data. One solution to this problem is to view adaptive training in a Bayesian framework and marginalise out the transform parameters. Exact implementation of this Bayesian inference is intractable. Recently, lower-bound approximations based on variational Bayes have been used to solve this problem for batch adaptation with limited data. This paper extends this Bayesian adaptation framework to incremental adaptation. Various lower-bound approximations and options for propagating information within this incremental framework are discussed. Experiments using adaptive models trained with both maximum likelihood and minimum phone error training are described. Using incremental Bayesian adaptation, gains were obtained over the standard approaches, especially for limited data.
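The Bayesian treatment sketched in the abstract can be written out roughly as follows. The notation here is an assumption, not taken from the paper: $\mathbf{O}$ denotes the test observations, $\mathcal{M}$ the canonical model, and $\mathbf{W}$ the transform parameters.

```latex
% Recognition with the transform marginalised out, rather than
% point-estimated from (possibly insufficient) adaptation data:
p(\mathbf{O} \mid \mathcal{M}) = \int p(\mathbf{O} \mid \mathbf{W}, \mathcal{M})\, p(\mathbf{W})\, d\mathbf{W}

% The exact integral is intractable; variational Bayes replaces the
% log-marginal with a lower bound using a tractable distribution q(W):
\log p(\mathbf{O} \mid \mathcal{M}) \;\ge\;
  \mathbb{E}_{q(\mathbf{W})}\!\left[
    \log \frac{p(\mathbf{O}, \mathbf{W} \mid \mathcal{M})}{q(\mathbf{W})}
  \right]
```

In an incremental setting, one natural way to realise the "propagating information" the abstract mentions is to let the approximate posterior $q(\mathbf{W})$ obtained after earlier utterances act as the prior over $\mathbf{W}$ for subsequent ones; whether the paper adopts exactly this scheme or a variant is one of the options it discusses.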