ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Paper: SLP-L10.3
Session: Speaker Adaptation
Time: Friday, May 19, 10:40 - 11:00
Presentation: Lecture
Topic: Speech and Spoken Language Processing: Speaker adaptation and normalization (e.g., VTLN)
Title: A Non-linear Speaker Adaptation Technique Using Kernel Ridge Regression
Authors: George Saon, IBM, United States
Abstract: We propose a non-linear model space transformation for speaker or environment adaptation based on weighted kernel ridge regression (KRR). The transformation is given by a generalized least squares linear regression in a kernel-induced feature space, operating on Gaussian mixture model means and having the adaptation frames as targets. Using the "kernel trick", the solution to the optimization problem is obtained by solving a system of linear equations involving the Gram matrix of the input variables. We show that MLLR is a special case of KRR when a linear kernel is employed. Furthermore, we study an efficient low-rank approximation to the kernel matrix, termed the "rectangle method", in which the regressors are chosen to be a small set of clustered adaptation frames. Experiments conducted on the EARS database (English conversational telephone speech) indicate that KRR with a Gaussian RBF kernel outperforms standard regression class-based MLLR.
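
The core computation described in the abstract, solving a linear system in the Gram matrix, can be illustrated with a minimal sketch of generic weighted kernel ridge regression. This is not the paper's implementation: the function names, the per-sample weights, the regularizer `lam`, and the kernel width `gamma` are all assumptions made for illustration, and the pairing of Gaussian means with adaptation frames is taken as given. The sketch minimizes sum_i w_i ||y_i - f(x_i)||^2 + lam ||f||^2, whose dual solution satisfies (K + lam W^-1) alpha = Y.

```python
import numpy as np

def linear_kernel(A, B):
    # Plain inner products between rows of A and rows of B.
    return A @ B.T

def rbf_kernel(A, B, gamma=0.5):
    # Gaussian RBF kernel exp(-gamma * ||a - b||^2); gamma is an assumed width.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_weighted_krr(X, Y, weights, kernel, lam=1e-2):
    # X: (n, d) inputs (here standing in for Gaussian means).
    # Y: (n, d) targets (here standing in for adaptation frames).
    # weights: (n,) strictly positive per-sample weights (hypothetical).
    # Solves (K + lam * W^-1) alpha = Y for the dual coefficients alpha.
    K = kernel(X, X)
    A = K + lam * np.diag(1.0 / weights)
    return np.linalg.solve(A, Y)

def predict_krr(X_train, alpha, X_new, kernel):
    # Transformed points: f(x) = sum_j alpha_j k(x_j, x).
    return kernel(X_new, X_train) @ alpha
```

With `linear_kernel`, the predictions coincide with those of an ordinary weighted ridge regression solved in the input space, which is the generic identity behind the abstract's remark that a linear kernel recovers a linear transform. The sketch has no bias term and no regression classes, so it should not be read as a full MLLR equivalent.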


