Technical Program

Paper Detail

Paper:	SLP-P5.3
Session:	Feature-based Robust Speech Recognition
Time:	Tuesday, May 16, 16:30 - 18:30
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Feature-based Robust Speech Recognition (e.g., noise, etc)
Title:	A PITCH-SYNCHRONOUS PEAK-AMPLITUDE BASED FEATURE EXTRACTION METHOD FOR NOISE ROBUST ASR
Authors:	Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta, Toyohashi University of Technology, Japan
Abstract:	In this paper, we propose a novel pitch-synchronous, auditory-based feature extraction method for robust automatic speech recognition (ASR). A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously [1,2] and showed improved performance, except when modulation enhancement was integrated into it together with Wiener filter (WF)-based noise reduction and auditory masking [3]. However, since a zero-crossing is not an auditory event, we propose a new pitch-synchronous peak-amplitude (PS-PA)-based method to make the ASR feature extractor more auditory-like. We also examine the effect of integrating WF-based noise reduction, modulation enhancement, and auditory masking into the proposed PS-PA method using the Aurora-2J database. The experimental results show that the proposed method outperforms the PS-ZCPA method and eliminates the problem caused by reconstructing zero-crossings from the modulated envelope. The highest relative performance over MFCC, 67.33%, was achieved using the PS-PA method together with WF-based noise reduction, modulation enhancement, and auditory masking.