Technical Program

Paper Detail

Paper:	SLP-P11.8
Session:	Front-end For Robust Speech Recognition
Time:	Wednesday, May 17, 16:30 - 18:30
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: End-point detection and barge-in methods
Title:	Auto-segmentation based partitioning and clustering approach to robust endpointing
Authors:	Yu Shi, Frank K. Soong, Jian-Lai Zhou, Microsoft Research Asia, China
Abstract:	An auto-segmentation based partitioning and clustering approach to robust Voice Activity Detection (VAD) is proposed. It proceeds in two successive steps: homogeneous frame partitioning and segment clustering. Because the first step is auto-segmenting, it does not need a noise model and is applicable to different noise types and SNRs. The algorithm is a dynamic programming based procedure and performs gracefully in finding segmentation thresholds. Multiple parameters such as energy, pitch, and voicing information can be easily incorporated into the procedure. The algorithm is evaluated on the test sets of the Aurora2 database. It proves robust in low-SNR operating environments, and the endpoint estimation errors are shown to have small variance.
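
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function, the number of segments, or the clustering rule, so everything below is an assumption. The sketch assumes a single per-frame feature (for example log energy), a within-segment squared-deviation cost for the dynamic programming step, a fixed segment count k, and a simple two-class split of segment means into noise and speech. The function names partition_frames, cluster_segments, and endpoints are hypothetical.

```python
import numpy as np

def partition_frames(x, k):
    """Assumed step 1: split x into k contiguous, homogeneous segments by
    dynamic programming, minimizing total within-segment squared deviation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Prefix sums give each candidate segment's cost in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def cost(i, j):  # cost of one segment covering frames i .. j-1
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s * s / (j - i)

    best = np.full((k + 1, n + 1), np.inf)  # best[m, j]: first j frames, m segments
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1, i] + cost(i, j)
                if c < best[m, j]:
                    best[m, j] = c
                    back[m, j] = i
    # Backtrack to recover the k+1 segment boundaries.
    bounds, j = [n], n
    for m in range(k, 0, -1):
        j = int(back[m, j])
        bounds.append(j)
    return bounds[::-1]

def cluster_segments(x, bounds, iters=20):
    """Assumed step 2: label each segment speech (True) or noise (False)
    by a two-class split of the segment means."""
    x = np.asarray(x, dtype=float)
    means = np.array([x[a:b].mean() for a, b in zip(bounds[:-1], bounds[1:])])
    lo, hi = means.min(), means.max()
    for _ in range(iters):
        labels = np.abs(means - hi) < np.abs(means - lo)
        if labels.all() or not labels.any():
            break  # only one class present; centroids cannot be updated
        lo, hi = means[~labels].mean(), means[labels].mean()
    return labels

def endpoints(x, k=3):
    """Return (start, end) frame indices of the speech region, end exclusive,
    or None if no segment is labeled speech."""
    bounds = partition_frames(x, k)
    labels = cluster_segments(x, bounds)
    speech = [(a, b) for a, b, s in zip(bounds[:-1], bounds[1:], labels) if s]
    if not speech:
        return None
    return speech[0][0], speech[-1][1]
```

For a toy feature track with 10 low-valued frames, 20 high-valued frames, and 10 low-valued frames, endpoints(x, 3) brackets the high-valued region. No noise model or fixed energy threshold is supplied here; the boundaries come from the partitioning itself, which is the property the abstract highlights. Additional parameters such as pitch or voicing could in principle be handled by replacing the scalar cost with a vector-valued one, though how the paper combines them is not stated in the abstract.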