Technical Program

Paper Detail

Paper:	SPTM-L3.5
Session:	Applications to Speech and Audio
Time:	Tuesday, May 16, 17:50 - 18:10
Presentation:	Lecture
Topic:	Signal Processing Theory and Methods: Signal Restoration, Reconstruction, and Enhancement
Title:	Generalized Optimal Multi-Microphone Speech Enhancement Using Sequential Minimum Variance Distortionless Response (MVDR) Beamforming and Postfiltering
Authors:	Lae-Hoon Kim, Mark Hasegawa-Johnson, University of Illinois at Urbana-Champaign, United States; Koeng-Mo Sung, Seoul National University, Republic of Korea
Abstract:	A theoretical basis for optimal multichannel speech enhancement is presented that is sufficiently flexible to be used with any assumed statistical model and optimality criterion. Any Bayesian optimal one-channel estimator for speech enhancement can be generalized to the multichannel case as a sequentially constructed minimum variance distortionless response (MVDR) beamformer followed by an optimal one-channel postfilter. We present experimental results using the minimum mean-square error log-spectral amplitude (MMSE-logSA) optimality criterion, applied to a statistical model with a simplified channel but realistic inter-microphone noise coherence. Word error rate on the audio-visual speech in a car (AVICAR) corpus (moving car, windows open) is reduced from 18% to 9%.
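
The beamformer-plus-postfilter structure described in the abstract can be illustrated for a single frequency bin. The following is a minimal sketch, not the authors' implementation. It uses the textbook MVDR solution w = R^{-1} d / (d^H R^{-1} d) and, for brevity, swaps in a plain Wiener gain in place of the MMSE-logSA postfilter used in the paper. The function names (`solve`, `mvdr_weights`, `beamform`, `wiener_gain`), the two-microphone setup, and all numeric values are assumptions chosen for illustration.

```python
# Illustrative sketch only: single-bin MVDR beamformer followed by a
# one-channel postfilter. All names and numbers here are hypothetical.

def solve(A, b):
    """Solve A x = b for a small complex matrix (Gaussian elimination
    with partial pivoting)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):  # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def mvdr_weights(noise_cov, steering):
    """Return (w, residual_noise_var) with w = R^{-1} d / (d^H R^{-1} d).
    The residual noise variance at the beamformer output is 1 / (d^H R^{-1} d)."""
    r_inv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * v for d, v in zip(steering, r_inv_d))
    return [v / denom for v in r_inv_d], (1 / denom).real

def beamform(weights, mics):
    """Beamformer output y = w^H x for one time-frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mics))

def wiener_gain(speech_var, noise_var):
    """Assumed stand-in postfilter: Wiener gain xi / (1 + xi),
    NOT the MMSE-logSA estimator named in the abstract."""
    return speech_var / (speech_var + noise_var)

# Assumed example: two microphones, unit-gain channel (d = [1, 1]),
# unit noise power per microphone, inter-microphone noise coherence 0.5.
R = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, 1 + 0j]]
d = [1 + 0j, 1 + 0j]
w, residual_noise = mvdr_weights(R, d)

x = [0.9 + 0.2j, 1.1 - 0.1j]          # hypothetical noisy microphone bin
y = beamform(w, x)                    # stage 1: MVDR beamformer
s_hat = wiener_gain(3.0, residual_noise) * y  # stage 2: postfilter
```

In this sketch the distortionless constraint can be checked directly: `beamform(w, d)` equals 1, so a signal arriving through the assumed channel passes unchanged while the noise power at the output drops to `residual_noise`. Replacing `wiener_gain` with another single-channel estimator changes only the second stage.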