Technical Program

Paper Detail

Paper:	SLP-L13.3
Session:	Missing Data Methods in Robust Speech Recognition
Time:	Friday, May 19, 17:10 - 17:30
Presentation:	Lecture
Topic:	Speech and Spoken Language Processing: Model-based robust Speech Recognition
Title:	A Supervised Learning Approach to Uncertainty Decoding for Robust Speech Recognition
Authors:	Soundararajan Srinivasan, DeLiang Wang, The Ohio State University, United States
Abstract:	Recently, several algorithms have been proposed to enhance noisy speech by estimating a binary mask that selects those time-frequency regions of a noisy speech signal that contain more speech energy than noise energy. This binary mask encodes the uncertainty associated with the enhanced speech in the linear spectral domain. The cepstral transformation smears this uncertainty. We propose a supervised approach to learn the nonlinear transformation of the uncertainty from the linear spectral domain to the cepstral domain. This uncertainty is used by a decoder that exploits the variance associated with the enhanced cepstral features to improve robust speech recognition. Systematic evaluations on a subset of the Aurora4 task using the estimated uncertainty show substantial improvement over the baseline performance.
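
The two ideas in the abstract, a binary time-frequency mask and a decoder that uses feature variance, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`binary_mask`, `log_gauss_with_uncertainty`) and the toy values are hypothetical. The mask here is computed from known speech and noise energies, whereas the abstract concerns masks that are estimated. The decoder rule shown, adding the feature variance to the diagonal Gaussian model variance, is one common form of uncertainty decoding and is an assumption, since the abstract does not give the exact rule. The supervised mapping of uncertainty from the spectral to the cepstral domain is not sketched.

```python
import math


def binary_mask(speech_energy, noise_energy):
    """Return 1 for each time-frequency unit where speech energy exceeds
    noise energy, else 0. Inputs are lists of rows (frames x channels)."""
    return [
        [1 if s > n else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_energy, noise_energy)
    ]


def log_gauss_with_uncertainty(x, mean, var, feat_var):
    """Log-likelihood of feature vector x under a diagonal Gaussian whose
    variance is inflated by the per-dimension feature variance (assumed rule)."""
    total = 0.0
    for xi, mi, vi, ui in zip(x, mean, var, feat_var):
        v = vi + ui  # model variance plus feature uncertainty
        total += -0.5 * (math.log(2.0 * math.pi * v) + (xi - mi) ** 2 / v)
    return total


# Toy values, chosen only for illustration.
mask = binary_mask([[2.0, 0.5], [1.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])

# A feature far from the model mean is penalized less when it is marked uncertain.
certain = log_gauss_with_uncertainty([3.0], [0.0], [1.0], [0.0])
uncertain = log_gauss_with_uncertainty([3.0], [0.0], [1.0], [3.0])
```

With zero feature variance the function reduces to the ordinary diagonal Gaussian log-likelihood, so a conventional decoder is the special case of complete certainty.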