Paper: | SPTM-L3.5 |
Session: | Applications to Speech and Audio |
Time: | Tuesday, May 16, 17:50 - 18:10 |
Presentation: | Lecture |
Topic: |
Signal Processing Theory and Methods: Signal Restoration, Reconstruction, and Enhancement |
Title: |
Generalized Optimal Multi-Microphone Speech Enhancement Using Sequential Minimum Variance Distortionless Response (MVDR) Beamforming and Postfiltering |
Authors: |
Lae-Hoon Kim, Mark Hasegawa-Johnson, University of Illinois at Urbana-Champaign, United States; Koeng-Mo Sung, Seoul National University, Republic of Korea |
Abstract: |
A theoretical basis for optimal multichannel speech enhancement is presented that is sufficiently flexible to be used with any assumed statistical model and optimality criterion. Any Bayesian optimal one-channel estimator for speech enhancement can be generalized to the multichannel case as a sequentially constructed minimum variance distortionless response (MVDR) beamformer followed by an optimal one-channel postfilter. We present experimental results using the minimum mean-square error log-spectral amplitude (MMSE-logSA) optimality criterion, applied to a statistical model with a simplified channel but realistic inter-microphone noise coherence. Word error rate on the audio-visual speech in a car (AVICAR) corpus (moving car, windows open) is reduced from 18% to 9%. |
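The beamformer-then-postfilter structure described in the abstract can be illustrated for a single frequency bin. The sketch below is not the authors' implementation: it uses the textbook closed-form MVDR weights w = R^{-1} d / (d^H R^{-1} d) rather than the paper's sequential construction, and it substitutes a simple Wiener gain for the MMSE-logSA postfilter. All function names, the known noise covariance, the known steering vector, and the known speech variance are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def mvdr_weights(noise_cov, steering):
    """Return (w, residual noise variance) with w = R^{-1} d / (d^H R^{-1} d)."""
    r_inv_d = solve(noise_cov, steering)
    # d^H R^{-1} d is real when R is Hermitian positive definite.
    denom = sum(d.conjugate() * v for d, v in zip(steering, r_inv_d))
    weights = [v / denom for v in r_inv_d]
    return weights, (1.0 / denom).real

def beamform(weights, mics):
    """Beamformer output y = w^H x for one frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mics))

def wiener_gain(snr):
    """Stand-in one-channel postfilter (assumption: Wiener gain, not MMSE-logSA)."""
    return snr / (1.0 + snr)

def enhance(noise_cov, steering, mics, speech_var):
    """MVDR beamforming followed by a one-channel postfilter on the beamformer output."""
    weights, noise_var = mvdr_weights(noise_cov, steering)
    y = beamform(weights, mics)
    return wiener_gain(speech_var / noise_var) * y

# Hypothetical two-microphone bin with coherent noise (coherence 0.5).
R = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, 1 + 0j]]
d = [1 + 0j, 0 + 1j]
w, residual_noise = mvdr_weights(R, d)
```

The distortionless constraint means `beamform(w, d)` equals 1 up to rounding, so the target passes through the beamformer unchanged and only the scalar postfilter gain shapes the final estimate. Replacing `wiener_gain` with another one-channel estimator leaves the beamforming stage untouched.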