Paper: | SLP-P5.3 |
Session: | Feature-based Robust Speech Recognition |
Time: | Tuesday, May 16, 16:30 - 18:30 |
Presentation: | Poster |
Topic: |
Speech and Spoken Language Processing: Feature-based Robust Speech Recognition (e.g., noise, etc) |
Title: |
A PITCH-SYNCHRONOUS PEAK-AMPLITUDE BASED FEATURE EXTRACTION METHOD FOR NOISE ROBUST ASR |
Authors: |
Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta, Toyohashi University of Technology, Japan |
Abstract: |
In this paper, we propose a novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR). A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously [1,2] and showed improved performance, except when modulation enhancement was integrated into it together with Wiener filter (WF)-based noise reduction and auditory masking [3]. However, since zero-crossing is not an auditory event, we propose a new pitch-synchronous peak-amplitude (PS-PA)-based method to make the ASR feature extractor more auditory-like. We also examine the effect of integrating WF-based noise reduction, modulation enhancement, and auditory masking into the proposed PS-PA method, using the Aurora-2J database. The experimental results show that the proposed method outperforms the PS-ZCPA method and eliminates the problem caused by reconstructing zero-crossings from the modulated envelope. The highest relative performance over MFCC, 67.33%, was achieved using the PS-PA method together with WF-based noise reduction, modulation enhancement, and auditory masking. |