Technical Program

Paper Detail

Paper:	AE-P3.12
Session:	Audio Coding, Network Audio and Multimedia Applications
Time:	Thursday, May 18, 10:00 - 12:00
Presentation:	Poster
Topic:	Audio and Electroacoustics: Audio for Multimedia
Title:	A Noise-Robust FFT-Based Spectrum for Audio Classification
Authors:	Wei Chu, Benoit Champagne, McGill University, Canada
Abstract:	An early auditory model that calculates a so-called auditory spectrum has recently been employed in audio classification, where excellent performance is reported along with robustness in noisy environments. Unfortunately, this early auditory model is characterized by high computational requirements and the use of nonlinear processing. In this paper, inspired by the inherent self-normalization property of the early auditory model, we propose a simplified FFT-based spectrum that is noise-robust in audio classification. To evaluate the performance of the proposed FFT-based spectrum, a three-class (i.e., speech, music, and noise) audio classification task is carried out in which a support vector machine (SVM) is employed as the classifier. Compared to a conventional FFT-based spectrum, both the original auditory spectrum and the proposed self-normalized FFT-based spectrum show more robust performance in noisy test cases. Test results also indicate that the performance of the self-normalized FFT-based spectrum is close to that of the original auditory spectrum, while its computational complexity is significantly lower.
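
The abstract does not specify how the self-normalization is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that each frame's magnitude spectrum is divided by the sum of its own bins; the function names, the `eps` guard, and the test tone are all hypothetical. The sketch demonstrates only that such a spectrum is unchanged when the input frame is rescaled, which is a narrower property than the noise robustness reported in the paper.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of a naive DFT over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def self_normalized_spectrum(frame, eps=1e-12):
    """Divide each bin by the frame's summed magnitude.

    ASSUMPTION: this particular normalization is illustrative only;
    the abstract does not define the one used in the paper.
    """
    mags = magnitude_spectrum(frame)
    total = sum(mags) + eps  # eps avoids division by zero on silent frames
    return [m / total for m in mags]


# A 64-sample sinusoid centred on bin 4, and the same signal scaled by 10.
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
loud = [10.0 * x for x in tone]

quiet_spec = self_normalized_spectrum(tone)
loud_spec = self_normalized_spectrum(loud)

# Largest per-bin difference between the two normalized spectra.
max_diff = max(abs(a - b) for a, b in zip(quiet_spec, loud_spec))
print(max_diff)
```

A naive DFT is used here only to keep the sketch free of dependencies; a practical implementation would use an FFT routine, and the resulting per-frame vectors would then be passed to a classifier such as the SVM mentioned in the abstract.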