Paper: | AE-P1.5 |
Session: | Loudspeaker and Microphone Array Processing |
Time: | Wednesday, May 17, 10:00 - 12:00 |
Presentation: | Poster |
Topic: | Audio and Electroacoustics: Spatial and Multichannel Audio |
Title: | ZERO-CROSSING BASED BINAURAL MASK ESTIMATION FOR MISSING DATA SPEECH RECOGNITION |
Authors: | Young-Ik Kim, Sung An, Rhee Kil, Korea Advanced Institute of Science and Technology, Republic of Korea |
Abstract: | This paper presents a new zero-crossing based method of binaural mask estimation for missing data speech recognition when multiple sound sources are present simultaneously. The mask is determined from the estimated directions of the sound sources, using spatial cues such as inter-aural time differences (ITDs) and inter-aural intensity differences (IIDs). In the proposed method, ITDs are estimated from the statistical properties of zero-crossings generated from binaural filter-bank outputs. IID samples are also used to aid ITD estimation, which resolves the phase ambiguities of ITD samples at high frequencies. As a result, the proposed method provides accurate estimates of sound source directions and a good masking scheme for speech recognition, with significantly lower computational complexity than cross-correlation based methods. |
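The idea of reading an ITD off zero-crossings, rather than off a cross-correlation peak, can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a single filter-bank channel modeled as a pure sinusoid, pairs each left-ear upward zero-crossing with the nearest right-ear one, and takes the median of the time differences as a stand-in for the statistical estimate mentioned in the abstract. The function names, the 1 ms pairing limit, and the 100 microsecond mask tolerance are all hypothetical, and the IID-aided disambiguation at high frequencies is not covered.

```python
import math
from statistics import median


def upward_zero_crossings(x, fs):
    """Times (in seconds) of upward zero-crossings, linearly interpolated."""
    times = []
    for n in range(len(x) - 1):
        if x[n] < 0.0 <= x[n + 1]:
            frac = -x[n] / (x[n + 1] - x[n])  # sub-sample position in [0, 1]
            times.append((n + frac) / fs)
    return times


def estimate_itd(left, right, fs, max_itd=1e-3):
    """Median of (right - left) nearest-crossing time differences.

    Pairs whose difference exceeds max_itd (an assumed limit) are dropped.
    Returns None when no valid pair is found.
    """
    zl = upward_zero_crossings(left, fs)
    zr = upward_zero_crossings(right, fs)
    if not zr:
        return None
    diffs = []
    for t in zl:
        nearest = min(zr, key=lambda u: abs(u - t))
        d = nearest - t
        if abs(d) <= max_itd:
            diffs.append(d)
    return median(diffs) if diffs else None


def keep_channel(itd, target_itd, tol=100e-6):
    """Binary mask decision: keep the channel if its ITD matches the target."""
    return itd is not None and abs(itd - target_itd) <= tol


# Synthetic single-channel test signal: 500 Hz tone, right ear delayed by 250 us.
fs = 16000
f0 = 500.0
true_itd = 250e-6
num_samples = 800
left = [math.sin(2 * math.pi * f0 * (k / fs) + 0.3) for k in range(num_samples)]
right = [math.sin(2 * math.pi * f0 * (k / fs - true_itd) + 0.3)
         for k in range(num_samples)]

itd = estimate_itd(left, right, fs)
```

On this toy signal the estimate lands within a few microseconds of the true delay, so `keep_channel(itd, 250e-6)` retains the channel while a target on the opposite side rejects it. The per-channel work is a single pass to find crossings plus a pairing step, which hints at why a zero-crossing approach can be cheaper than evaluating a cross-correlation over many lags.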