Technical Program

Paper Detail

Paper:	SLP-L12.1
Session:	Discriminative Training
Time:	Friday, May 19, 14:00 - 14:20
Presentation:	Lecture
Topic:	Speech and Spoken Language Processing: Discriminative Training Methods
Title:	Large Margin Gaussian Mixture Modeling for Phonetic Classification and Recognition
Authors:	Fei Sha, Lawrence Saul, University of Pennsylvania, United States
Abstract:	We develop a framework for large margin classification by Gaussian mixture models (GMMs). Large margin GMMs have many parallels to support vector machines (SVMs), but with classes modeled by ellipsoids instead of half-spaces. Model parameters are trained discriminatively to maximize the margin of correct classification, as measured in terms of Mahalanobis distances. The required optimization is convex over the model's parameter space of positive semidefinite matrices and can be performed efficiently. Large margin GMMs are naturally suited to large problems in multiway classification; we apply them to phonetic classification and recognition on the TIMIT database. On both tasks, we obtain significant improvement over baseline systems trained by maximum likelihood estimation. For the problem of phonetic classification, our results are competitive with other state-of-the-art classifiers, such as hidden conditional random fields.
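Illustration (not from the paper): the abstract describes classes modeled by ellipsoids and a margin measured in Mahalanobis distances. The following is a minimal sketch of that idea, assuming one ellipsoid per class, a unit margin, and a hinge penalty on margin violations. The names `means`, `precisions`, `offsets`, and `margin_hinge_loss`, as well as the scalar offset per class, are hypothetical choices made for this example; the paper's actual parameterization and training procedure may differ.

```python
import numpy as np

def mahalanobis_scores(x, means, precisions, offsets):
    """Score x against each class ellipsoid (lower is better).

    Each score is the squared Mahalanobis distance (x - mu)^T Psi (x - mu)
    plus a scalar offset. Psi is assumed positive semidefinite.
    The offset is an assumption of this sketch.
    """
    scores = []
    for mu, psi, theta in zip(means, precisions, offsets):
        d = x - mu
        scores.append(d @ psi @ d + theta)
    return np.array(scores)

def classify(x, means, precisions, offsets):
    """Assign x to the class whose ellipsoid is nearest."""
    return int(np.argmin(mahalanobis_scores(x, means, precisions, offsets)))

def margin_hinge_loss(x, label, means, precisions, offsets):
    """Sum of hinge penalties for competing classes that are not at least
    one unit farther away than the correct class (unit margin assumed)."""
    scores = mahalanobis_scores(x, means, precisions, offsets)
    violations = 1.0 + scores[label] - scores
    violations[label] = 0.0  # the correct class does not compete with itself
    return float(np.sum(np.maximum(0.0, violations)))

# Toy example: two classes in 2-D with identity precision matrices.
means = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
precisions = [np.eye(2), np.eye(2)]
offsets = [0.0, 0.0]

x = np.array([0.5, 0.0])
predicted = classify(x, means, precisions, offsets)
loss = margin_hinge_loss(x, 0, means, precisions, offsets)
```

In this toy setup the point lies well inside the first ellipsoid, so the margin constraint is satisfied and the hinge penalty is zero. A point near the midpoint between the two centroids would still be classified but would incur a positive penalty. Because each score is linear in the precision matrix and offset when the centroids are folded into an enlarged matrix, a loss of this shape can be made convex, which is consistent with the convexity claim in the abstract.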