Paper: | SLP-L12.1 |
Session: | Discriminative Training |
Time: | Friday, May 19, 14:00 - 14:20 |
Presentation: | Lecture |
Topic: | Speech and Spoken Language Processing: Discriminative Training Methods |
Title: | Large Margin Gaussian Mixture Modeling for Phonetic Classification and Recognition |
Authors: | Fei Sha, Lawrence Saul, University of Pennsylvania, United States |
Abstract: | We develop a framework for large margin classification by Gaussian mixture models (GMMs). Large margin GMMs have many parallels to support vector machines (SVMs), but with classes modeled by ellipsoids instead of half-spaces. Model parameters are trained discriminatively to maximize the margin of correct classification, as measured in terms of Mahalanobis distances. The required optimization is convex over the model's parameter space of positive semidefinite matrices and can be performed efficiently. Large margin GMMs are naturally suited to large problems in multiway classification; we apply them to phonetic classification and recognition on the TIMIT database. On both tasks, we obtain significant improvement over baseline systems trained by maximum likelihood estimation. For the problem of phonetic classification, our results are competitive with other state-of-the-art classifiers, such as hidden conditional random fields. |
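
The idea of classes modeled by ellipsoids, with margins measured in Mahalanobis distances, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one ellipsoid per class, a unit margin, and a hinge-style penalty, none of which the abstract specifies, and the names `mahalanobis_sq`, `classify`, and `margin_hinge_loss` are hypothetical.

```python
import numpy as np

def mahalanobis_sq(x, center, precision):
    """Squared Mahalanobis distance from x to an ellipsoid's center.

    `precision` is a positive semidefinite matrix (assumed here to play
    the role of an inverse covariance).
    """
    diff = x - center
    return float(diff @ precision @ diff)

def classify(x, centers, precisions):
    """Assign x to the class whose ellipsoid is nearest in Mahalanobis distance."""
    dists = [mahalanobis_sq(x, m, p) for m, p in zip(centers, precisions)]
    return int(np.argmin(dists))

def margin_hinge_loss(x, label, centers, precisions, margin=1.0):
    """Hinge penalty for margin violations (an assumed form of the objective).

    The example is penalized whenever a competing class is not at least
    `margin` farther away than the correct class.
    """
    dists = np.array([mahalanobis_sq(x, m, p) for m, p in zip(centers, precisions)])
    violations = margin + dists[label] - dists
    violations[label] = 0.0  # the correct class does not compete with itself
    return float(np.sum(np.maximum(0.0, violations)))

# Toy two-class problem with identity precision matrices (hypothetical data).
centers = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
precisions = [np.eye(2), np.eye(2)]

x = np.array([1.0, 0.0])
predicted = classify(x, centers, precisions)
loss = margin_hinge_loss(x, 0, centers, precisions)
```

In this toy setup the point lies well inside the first class's region, so the penalty is zero; a point closer to the second center would incur a positive penalty. Training would adjust the centers and matrices to drive such penalties down, which this sketch does not attempt.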