Paper: | SLP-L6.3 |
Session: | Advances in LVCSR Algorithms |
Time: | Wednesday, May 17, 17:10 - 17:30 |
Presentation: | Lecture |
Topic: | Speech and Spoken Language Processing: Alternative Statistical and Machine Learning Methods for General ASR (e.g., no-HMM methods) |
Title: | Augmented Statistical Models for Speech Recognition |
Authors: | Martin Layton, Mark J. F. Gales, University of Cambridge, United Kingdom |
Abstract: |
Recently there has been significant interest in developing new acoustic models for speech recognition. One such model, which allows complex dependencies to be represented, is the augmented statistical model. It incorporates additional dependencies by constructing a local exponential expansion of a standard HMM. Unfortunately, the resulting model often has an intractable normalisation term, rendering training difficult for all but binary classification tasks. In this paper, conditional augmented (C-Aug) models are proposed as an attractive alternative. Instead of modelling utterance likelihoods and inferring decision boundaries, C-Aug models directly model the posterior probability of class labels, conditioned on the utterance. The resulting model is easy to normalise and can be trained using conditional maximum likelihood estimation. In addition, because the model is convex, the optimisation converges to a global maximum. |
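The idea of modelling class posteriors directly, with a normaliser that is only a sum over classes, can be illustrated with a generic log-linear (softmax) classifier trained by conditional maximum likelihood. This is a minimal sketch and not the authors' formulation: the feature vectors, the toy data, and the function names `class_posteriors`, `conditional_log_likelihood`, and `cml_gradient_step` are all hypothetical, and the per-class features here are placeholder numbers rather than statistics derived from an HMM.

```python
import math


def class_posteriors(features, weights):
    """Return P(class | utterance) under a log-linear model.

    features[k] is a hypothetical feature vector for the utterance scored
    against class k; weights[k] is the weight vector for class k.
    """
    scores = [
        sum(w * f for w, f in zip(weights[k], features[k]))
        for k in range(len(weights))
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # the normaliser is a finite sum over classes only
    return [e / z for e in exps]


def conditional_log_likelihood(data, weights):
    """Sum of log P(correct label | utterance) over the training set."""
    return sum(
        math.log(class_posteriors(feats, weights)[label])
        for feats, label in data
    )


def cml_gradient_step(data, weights, lr=0.1):
    """One gradient-ascent step on the conditional log-likelihood."""
    grads = [[0.0] * len(w) for w in weights]
    for feats, label in data:
        post = class_posteriors(feats, weights)
        for k in range(len(weights)):
            # d/d(weights[k]) of log P(label | utterance)
            coeff = (1.0 if k == label else 0.0) - post[k]
            for i, f in enumerate(feats[k]):
                grads[k][i] += coeff * f
    return [
        [w + lr * g for w, g in zip(wk, gk)]
        for wk, gk in zip(weights, grads)
    ]


# Toy data (assumed for illustration): two utterances, two classes,
# two features per class.
data = [
    ([[1.0, 0.5], [0.2, 0.1]], 0),
    ([[0.1, 0.3], [0.9, 0.7]], 1),
]
weights = [[0.0, 0.0], [0.0, 0.0]]

for _ in range(50):
    weights = cml_gradient_step(data, weights)

print(conditional_log_likelihood(data, weights))
```

Because the objective in this sketch is concave in the weights, plain gradient ascent with a small step size moves towards the single global optimum, which mirrors the convexity property the abstract highlights.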