ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Technical Program

Paper Detail

Paper: SS-5.6
Session: Dealing with Intrinsic Speech Variabilities in ASR
Time: Wednesday, May 17, 15:40 - 16:00
Presentation: Special Session Lecture
Topic: Special Sessions: Dealing with intrinsic speech variabilities in ASR
Title: Fepstrum and Carrier Signal Decomposition of Speech Signals Through Homomorphic Filtering
Authors: Vivek Tyagi, Christian Wellekens, Institut Eurecom, France
Abstract: Amplitude modulation (AM) and frequency modulation (FM) have been well defined and studied in the context of communication systems \cite{62}. Borrowing from these ideas, several researchers have applied AM-FM modeling \cite{58,53,63,64} to speech signals, with mixed results. These techniques have varied in their definitions of AM and FM and, consequently, in the demodulation methods they use. In this paper, we carefully define AM and FM signals in the context of ASR. We show that a theoretically meaningful estimate of the AM signal requires decomposing the speech signal into several narrow spectral bands. This contrasts with previous uses of the speech modulation spectrum \cite{58,53,63,64}, which was derived by decomposing the speech signal into increasingly wide spectral bands (such as critical, Bark, or Mel bands). Due to the Hilbert relationships, the AM signal induces a component in the FM signal that is fully determinable from the AM signal \cite{50,100}. We present a novel homomorphic filtering technique that suppresses this redundant part of the FM signal and extracts the leftover FM signal. The estimated AM message signals are downsampled, and their lower DCT coefficients are retained as speech features. These features carry information that is complementary to the MFCCs. A Tandem \cite{56} combination of the two feature sets is shown to improve recognition accuracy.
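The AM feature path described in the abstract (narrow-band signal, AM envelope, downsampling, lower DCT coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`analytic_signal`, `am_envelope`, `dct2`, `am_features`), the decimation factor, and the number of retained coefficients are all hypothetical choices made for illustration. The envelope is taken as the magnitude of the analytic signal, which is one common demodulation approach and may differ from the paper's. The input is assumed to be already restricted to one narrow spectral band, and the homomorphic filtering step for the FM signal is not shown.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short demo frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Analytic signal: zero the negative frequencies, double the positive ones."""
    n = len(x)
    spec = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue  # DC and Nyquist bins are left unchanged
        if k < (n + 1) // 2:
            spec[k] *= 2  # positive-frequency bin
        else:
            spec[k] = 0j  # negative-frequency bin
    return dft(spec, inverse=True)


def am_envelope(band_signal):
    """AM envelope estimate: magnitude of the analytic signal (assumed method)."""
    return [abs(z) for z in analytic_signal(band_signal)]


def dct2(x, n_coeffs):
    """First n_coeffs coefficients of the unnormalized DCT-II."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (t + 0.5) / n) for t in range(n))
        for k in range(n_coeffs)
    ]


def am_features(band_signal, decimation=4, n_coeffs=5):
    """Downsample the envelope and keep its lower DCT coefficients.

    decimation and n_coeffs are illustrative values, not taken from the paper.
    Plain subsampling is used here; a real system would low-pass filter first.
    """
    env = am_envelope(band_signal)
    env_ds = env[::decimation]
    return dct2(env_ds, n_coeffs)


# Synthetic narrow-band test frame: slow AM message on a single carrier.
N = 64
frame = [
    (1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / N)) * math.cos(2 * math.pi * 16 * t / N)
    for t in range(N)
]
features = am_features(frame)
```

For a real speech frame, `am_features` would be applied to the output of each narrow band-pass filter, and the per-band coefficients concatenated into one feature vector; that arrangement is likewise an assumption here rather than a detail stated in the abstract.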



IEEE Signal Processing Society
