ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Technical Program

Paper Detail

Paper: SLP-L13.6
Session: Missing Data Methods in Robust Speech Recognition
Time: Friday, May 19, 18:10 - 18:30
Presentation: Lecture
Topic: Speech and Spoken Language Processing: Feature-based Robust Speech Recognition (e.g., noise, etc)
Title: SPEECH RECOGNITION IN MULTISOURCE REVERBERANT ENVIRONMENTS WITH BINAURAL INPUTS
Authors: Nicoleta Roman, The Ohio State University at Lima, United States; Soundararajan Srinivasan, DeLiang Wang, The Ohio State University, United States
Abstract: We present a binaural solution to robust speech recognition in multi-source reverberant environments. We employ the notion of an ideal time-frequency binary mask, which selects the target if it is stronger than the interference in a local time-frequency (T-F) unit. Our system estimates this ideal binary mask at the output of a target cancellation module implemented using adaptive filtering. This mask is used in conjunction with a missing-data algorithm to decode the target utterance. A systematic evaluation in terms of automatic speech recognition (ASR) performance shows substantial improvements over the baseline and better results than related two-microphone approaches.
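
The ideal binary mask mentioned in the abstract can be illustrated with a minimal sketch. It assumes the target and interference energies in each T-F unit are known separately, which is only possible with premixed signals; the system described in the paper estimates the mask instead, and that estimator is not shown here. The function name, the matrix layout (rows as frequency channels, columns as time frames), and the threshold_db parameter are assumptions made for illustration, with 0 dB corresponding to "target stronger than interference".

```python
from typing import List

Matrix = List[List[float]]  # assumed layout: rows = frequency channels, columns = time frames


def ideal_binary_mask(target_energy: Matrix,
                      interference_energy: Matrix,
                      threshold_db: float = 0.0) -> List[List[int]]:
    """Label a T-F unit 1 when the target energy exceeds the interference
    energy by more than threshold_db, and 0 otherwise.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(target_energy) != len(interference_energy):
        raise ValueError("inputs must have the same number of channels")

    # Convert the dB threshold to a linear energy ratio (0 dB -> 1.0).
    factor = 10.0 ** (threshold_db / 10.0)

    mask: List[List[int]] = []
    for t_row, i_row in zip(target_energy, interference_energy):
        if len(t_row) != len(i_row):
            raise ValueError("inputs must have the same number of frames")
        mask.append([1 if t > factor * i else 0 for t, i in zip(t_row, i_row)])
    return mask


# Toy energies for 2 channels x 3 frames (made-up numbers, not data from the paper).
target = [[4.0, 1.0, 0.5],
          [0.2, 3.0, 2.0]]
interference = [[1.0, 2.0, 0.5],
                [0.1, 3.5, 0.5]]

mask = ideal_binary_mask(target, interference)
# mask == [[1, 0, 0], [1, 0, 1]]
```

Units labeled 1 would be treated as reliable and units labeled 0 as missing by a missing-data recognizer; a tie counts as 0 here because the abstract requires the target to be stronger.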



IEEE Signal Processing Society
