ICASSP 2006 - May 15-19, 2006 - Toulouse, France

Technical Program

Paper Detail

Paper: SLP-P16.2
Session: Speaker Tracking and Adaptation
Time: Thursday, May 18, 16:30 - 18:30
Presentation: Poster
Topic: Speech and Spoken Language Processing: Clustering and novel modeling algorithms
Title: FAST AND ROBUST SPEAKER CLUSTERING USING THE EARTH MOVER'S DISTANCE AND MIXMAX MODELS
Authors: Thilo Stadelmann, Bernd Freisleben, University of Marburg, Germany
Abstract: Speaker clustering is the task of assigning a unique label to all speech segments in a video uttered by the same speaker. There are two key challenges: processing speed and robustness in the presence of noise. In this paper, we present an approach that significantly improves the processing speed of a hierarchical speaker clustering algorithm by using the earth mover's distance (EMD) as the distance measure. Noise robustness is achieved by extending the well-known MIXMAX speaker model such that the EMD can be applied. Experimental results show that the runtime of the proposed EMD approach decreases by more than a factor of 120 compared to a likelihood-ratio-based distance measure, while the clustering performance remains nearly the same.
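
Illustrative sketch (not taken from the paper): the abstract gives no implementation details, so the following Python example only shows the general idea of hierarchical clustering driven by the EMD. It rests on several simplifying assumptions: each speech segment is summarized by a hypothetical one-dimensional signature of (position, weight) pairs, the EMD is computed with the closed form that holds only in one dimension, and merging stops at a hypothetical distance threshold. The function names emd_1d and cluster_segments are made up for this example; the paper itself applies the EMD to extended MIXMAX speaker models, which is not reproduced here.

```python
# Minimal sketch, assuming 1-D signatures; NOT the authors' implementation.

def emd_1d(sig_a, sig_b):
    """EMD between two 1-D signatures given as lists of (position, weight).

    Weights are normalized to total mass 1, so in one dimension the EMD
    equals the area between the two cumulative distribution functions.
    """
    total_a = sum(w for _, w in sig_a)
    total_b = sum(w for _, w in sig_b)
    # Mass of A counts positive, mass of B negative; sort by position.
    events = [(x, w / total_a) for x, w in sig_a]
    events += [(x, -w / total_b) for x, w in sig_b]
    events.sort()

    distance = 0.0
    cdf_diff = 0.0          # running difference F_A(x) - F_B(x)
    prev_x = events[0][0]
    for x, mass in events:
        distance += abs(cdf_diff) * (x - prev_x)
        cdf_diff += mass
        prev_x = x
    return distance


def cluster_segments(signatures, threshold):
    """Agglomerative clustering: merge the closest pair until the smallest
    EMD exceeds `threshold` (a hypothetical stopping rule)."""
    # Each cluster holds its member segment indices and a pooled signature.
    clusters = [([i], list(sig)) for i, sig in enumerate(signatures)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = emd_1d(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]


# Toy data: segments 0 and 1 resemble each other, as do segments 2 and 3.
segments = [
    [(0.0, 0.5), (1.0, 0.5)],
    [(0.1, 0.5), (1.1, 0.5)],
    [(5.0, 1.0)],
    [(5.2, 1.0)],
]
labels = cluster_segments(segments, threshold=1.0)
```

With this toy data, segments 0 and 1 end up in one cluster and segments 2 and 3 in another. For multidimensional mixture models the EMD generally requires solving a transportation problem rather than using this one-dimensional shortcut.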



