Technical Program

Paper Detail

Paper:	SLP-P16.4
Session:	Speaker Tracking and Adaptation
Time:	Thursday, May 18, 16:30 - 18:30
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Speaker adaptation and normalization (e.g., VTLN)
Title:	FAST SPEAKER ADAPTION VIA MAXIMUM PENALIZED LIKELIHOOD KERNEL REGRESSION
Authors:	Ivor W. Tsang, James T. Kwok, Brian Mak, Kai Zhang, Jeffrey J. Pan, Hong Kong University of Science and Technology, Hong Kong SAR of China
Abstract:	Maximum likelihood linear regression (MLLR) has been a popular speaker adaptation method for many years. In this paper, we investigate a generalization of MLLR using nonlinear regression. Specifically, kernel regression is applied with appropriate regularization to determine the transformation matrix in MLLR for fast speaker adaptation. The proposed method, called maximum penalized likelihood kernel regression adaptation (MPLKR), is computationally simple: the mean vectors of the speaker-adapted acoustic model can be obtained analytically by solving a linear system. Since no nonlinear optimization is involved, the obtained solution is guaranteed to be globally optimal. The new adaptation method was evaluated on the Resource Management task with 5 s and 10 s of adaptation speech. Results show that MPLKR outperforms the standard MLLR method.
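The abstract's central point, that a regularized kernel regression can be solved in closed form through a single linear system with no iterative nonlinear optimization, can be illustrated with a generic sketch. This is not the MPLKR algorithm itself: the abstract does not give its objective or update formulas, so the code below shows ordinary kernel ridge regression on toy data. The function names (`rbf_kernel`, `fit_kernel_ridge`, `predict`), the Gaussian kernel, and the values of `lam` and `gamma` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_kernel_ridge(X, Y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = Y for the dual coefficients alpha.

    For lam > 0 the system matrix is symmetric positive definite, so the
    penalized least-squares objective has a unique global minimizer.
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)


def predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate the fitted regression function at the rows of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha


# Toy data (an assumption, unrelated to the Resource Management task).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(40, 1))
Y = np.sin(X)

alpha = fit_kernel_ridge(X, Y, lam=1e-3, gamma=0.5)
Y_hat = predict(X, alpha, X, gamma=0.5)
print("max training error:", float(np.max(np.abs(Y_hat - Y))))
```

The penalty `lam` plays the role of the regularizer: it keeps the linear system well conditioned when only a few training points are available, which is the same concern that arises with only seconds of adaptation speech.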