Paper: | SLP-L13.5 |
Session: | Missing Data Methods in Robust Speech Recognition |
Time: | Friday, May 19, 17:50 - 18:10 |
Presentation: | Lecture |
Topic: |
Speech and Spoken Language Processing: Feature-based Robust Speech Recognition (e.g., noise, etc) |
Title: |
BAND-INDEPENDENT MASK ESTIMATION FOR MISSING-FEATURE RECONSTRUCTION IN THE PRESENCE OF UNKNOWN BACKGROUND NOISE |
Authors: |
Wooil Kim, University of Texas, Dallas, United States; Richard Stern, Carnegie Mellon University, United States |
Abstract: |
An effective mask estimation scheme for missing-feature reconstruction is described that achieves robust speech recognition in the presence of unknown noise. In previous work on Bayesian classification for mask estimation, white noise and colored noise were used for training mask estimators. This paper, which is concerned both with simulating a more diverse set of background environments and with mitigating the "sparse training" problem, describes a new Bayesian mask-estimation procedure in which each frequency band is trained independently. The new method employs colored noise for training, which is obtained by partitioning each frequency subband. We also propose a method for re-evaluating voiced/unvoiced decisions to alleviate the performance degradation caused by errors in pitch detection. Experimental results indicate that the proposed procedure, in conjunction with cluster-based missing-feature imputation, improves speech recognition accuracy on the Aurora 2.0 database for all types of background noise considered. |