Technical Program

Paper Detail

Paper:	SLP-P12.6
Session:	Speech Processing for Reverberation, Quantization and Enhancement
Time:	Thursday, May 18, 10:00 - 12:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Speech Perception and Psychoacoustics
Title:	Speech Enhancement using Transient Speech Components
Authors:	Charturong Tantibundhit, J. Robert Boston, Ching-Chung Li, John D. Durrant, Susan Shaiman, Kristie Kovacyk, Amro A. El-Jaroudi, University of Pittsburgh, United States
Abstract:	This paper describes an algorithm to decompose speech into tonal, transient, and residual components. The algorithm uses an MDCT-based hidden Markov chain model to isolate the tonal component and a wavelet-based hidden Markov tree model to isolate the transient component. We suggest that the auditory system, like the visual system, is probably sensitive to abrupt stimulus changes and that the transient component in speech may be particularly critical to speech perception. To test this suggestion, the transient component isolated by our algorithm was selectively amplified and recombined with the original speech to generate enhanced speech, whose energy was adjusted to equal that of the original speech. The intelligibility of the original and enhanced speech was evaluated with eleven human subjects using the modified rhyme protocol. The word recognition rates show that the enhanced speech can substantially improve speech intelligibility at low SNR levels (8% at -15 dB, 14% at -20 dB, and 18% at -25 dB).
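
The recombination step described in the abstract (amplify the transient component, add it back to the original speech, then rescale so the total energy matches the original) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `energy` and `enhance`, the `gain` parameter and its value, and the toy sample values are all assumptions made for illustration, and the transient component is taken as given rather than computed with the wavelet-based hidden Markov tree model.

```python
import math


def energy(signal):
    """Return the signal energy, defined here as the sum of squared samples."""
    return sum(x * x for x in signal)


def enhance(original, transient, gain):
    """Add an amplified transient component to the original signal, then
    rescale the result so its energy equals the energy of the original.

    `gain` is a hypothetical amplification factor; the abstract does not
    state the value that was used.
    """
    if len(original) != len(transient):
        raise ValueError("signals must have the same length")

    # Recombine: original speech plus the amplified transient component.
    mixed = [o + gain * t for o, t in zip(original, transient)]

    e_mixed = energy(mixed)
    if e_mixed == 0.0:
        # Nothing to normalize if the mixture is silent.
        return mixed

    # Energy scales with the square of amplitude, hence the square root.
    scale = math.sqrt(energy(original) / e_mixed)
    return [scale * m for m in mixed]


# Toy data (assumed): a slow waveform with two abrupt "transient" samples.
original = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
transient = [0.0, 0.0, 0.4, 0.0, 0.0, 0.0, -0.4, 0.0]

enhanced = enhance(original, transient, gain=3.0)
```

Because the output is normalized to the original energy, emphasizing the transient samples necessarily attenuates the rest of the signal; under this sketch the enhancement changes how energy is distributed, not how much there is.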