Paper: | SS-5.2 |
Session: | Dealing with Intrinsic Speech Variabilities in ASR |
Time: | Wednesday, May 17, 14:20 - 14:40 |
Presentation: | Special Session Lecture |
Topic: | Special Sessions: Dealing with intrinsic speech variabilities in ASR |
Title: | Frequency-Warping Invariant Features for Automatic Speech Recognition |
Authors: | Alfred Mertins, Jan Rademacher, University of Oldenburg, Germany |
Abstract: | Based on the well-known relationship between vocal tract length (VTL) variation and linear frequency warping, we present a method for generating vocal tract length invariant (VTLI) features. These features are computed as translation-invariant, correlation-type features in a log-frequency domain. In phoneme classification and recognition experiments on the TIMIT database, their discrimination capabilities and robustness to mismatches between training and test conditions turned out to be considerably better than those of Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTLI features and MFCCs are combined. |
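
The central idea of the abstract can be illustrated with a small numerical sketch. It is not the authors' feature extraction: the toy spectral envelope, the formant positions, the frequency range, and the function names `log_spectrum_envelope` and `circular_autocorrelation` are all assumptions made for illustration. The sketch only shows that a linear frequency warping f -> alpha * f becomes a translation on a log-frequency axis, and that a correlation-type feature (here a circular autocorrelation) does not change under that translation.

```python
import numpy as np

def log_spectrum_envelope(freqs_hz, formants_hz=(500.0, 1500.0, 2500.0), width=0.08):
    """Toy envelope (hypothetical): Gaussian bumps placed on a log-frequency axis."""
    log_f = np.log(freqs_hz)
    return sum(np.exp(-0.5 * ((log_f - np.log(fc)) / width) ** 2) for fc in formants_hz)

def circular_autocorrelation(x):
    """Circular autocorrelation via the FFT; unchanged by circular shifts of x."""
    power = np.abs(np.fft.fft(x)) ** 2
    return np.fft.ifft(power).real

# Log-spaced frequency grid (assumed range, wide enough that the envelope
# is essentially zero at both ends, so circular and linear shifts agree).
n_bins = 256
log_axis = np.linspace(np.log(50.0), np.log(16000.0), n_bins)
freqs = np.exp(log_axis)
step = log_axis[1] - log_axis[0]

# Choose the warping factor so that it equals a whole number of grid steps.
shift_bins = 8
alpha = np.exp(shift_bins * step)

original = log_spectrum_envelope(freqs)
warped = log_spectrum_envelope(alpha * freqs)  # linearly warped spectrum S(alpha * f)

features_original = circular_autocorrelation(original)
features_warped = circular_autocorrelation(warped)

# The warped envelope is the original one translated along the log axis,
# while the correlation features stay the same.
print("warped equals shifted original:",
      np.allclose(warped, np.roll(original, -shift_bins), atol=1e-9))
print("features unchanged by warping:",
      np.allclose(features_original, features_warped, atol=1e-8))
```

The circular boundary is a simplification chosen so that the invariance is exact in this sketch. With real speech spectra the translation is not circular and the analysis band is limited, so the invariance can be expected to hold only approximately.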