| Abstract: | In this paper, we examine the problem of kernel selection for one-versus-all (OVA) classification of multiclass data with support vector machines (SVMs). We focus specifically on training generalized linear kernels of the form $k(x,y) = x^{T}Ry$, where $R$ is a positive semidefinite matrix. Our approach for training $k(x,y)$ first constructs a set of upper bounds on the rates of false positives and false negatives at a given score threshold. Under various conditions, minimizing these bounds leads to the closed-form solution $R = W^{-1}$, where $W$ is the expected within-class covariance matrix of the data. We tested various parameterizations of $R$, including a diagonal parameterization that simply performs per-feature variance normalization, on the 1-conversation training condition of the SRE-2003 and SRE-2004 speaker recognition tasks. In experiments on a state-of-the-art MLLR-SVM speaker recognition system \cite{Stolcke}, the parameterization $R = \hat{W}_{s}^{-1}$, where $\hat{W}_{s}$ is a smoothed estimate of $W$, achieves relative reductions in the minimum decision cost function (DCF) of up to 22\% compared with the results obtained when $R$ performs per-feature variance normalization. |
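
The kernel form described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names and the toy data are hypothetical, and the smoothing rule $(1-\alpha)W + \alpha I$ is an assumption, since the abstract does not state how $\hat{W}_{s}$ is smoothed or which variances the diagonal parameterization uses.

```python
import numpy as np


def within_class_covariance(X, labels):
    """Estimate W as the class-frequency-weighted average of per-class covariances."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape
    W = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        centered = Xc - Xc.mean(axis=0)
        # Summing per-class scatter matrices and dividing by n (below)
        # weights each class covariance by its empirical frequency.
        W += centered.T @ centered
    return W / n


def smoothed_inverse(W, alpha=0.1):
    """Return the inverse of a smoothed W.

    Assumption: smoothing interpolates W with the identity matrix;
    the abstract does not specify the smoothing actually used.
    """
    d = W.shape[0]
    W_s = (1.0 - alpha) * W + alpha * np.eye(d)
    return np.linalg.inv(W_s)


def diagonal_R(variances):
    """Diagonal parameterization: divide each feature product by its variance."""
    return np.diag(1.0 / np.asarray(variances, dtype=float))


def generalized_linear_kernel(x, y, R):
    """Evaluate k(x, y) = x^T R y."""
    return float(np.asarray(x, dtype=float) @ R @ np.asarray(y, dtype=float))


# Toy data (hypothetical): 6 classes, 10 samples each, 3 features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
labels_demo = np.repeat(np.arange(6), 10)

W_demo = within_class_covariance(X_demo, labels_demo)
R_demo = smoothed_inverse(W_demo, alpha=0.1)
print(generalized_linear_kernel(X_demo[0], X_demo[1], R_demo))
```

Because $R$ is positive definite here, it factors as $R = LL^{T}$, so $k(x,y)$ equals an ordinary dot product between the transformed vectors $L^{T}x$ and $L^{T}y$. In practice this means a standard linear SVM can be trained on the transformed features instead of using a custom kernel.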