Paper: | SLP-P16.9 |
Session: | Speaker Tracking and Adaptation |
Time: | Thursday, May 18, 16:30 - 18:30 |
Presentation: | Poster |
Topic: | Speech and Spoken Language Processing: Speaker Identification |
Title: | Nuts and Flakes: A Study of Data Characteristics in Speaker Diarization |
Authors: | Nikki Mirghafori, Chuck Wooters, University of California, Berkeley, United States |
Abstract: | Researchers in the speaker diarization community have observed that some audio files show unusually high Diarization Error Rates (DER) (hard-to-crack "nuts"), and some exhibit hypersensitivity to tuning parameters ("flakes"). This study systematically examines the features that correlate with such behavior. We calculated over forty features for each of 24 shows from the Broadcast News corpus along the dimensions of speaker count, conversation turns, and speaker and show duration. We observed that the number of speakers, the number of turns, and the do-nothing DER (a measure related to the percentage of time the dominant speaker spoke) correlated best with "nuttiness". The do-nothing DER and the number of speakers were also the best correlates of "flakiness". |
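The abstract describes the do-nothing DER only as a measure related to the share of time held by the dominant speaker. The sketch below is one plausible reading, not the authors' definition: it assumes the do-nothing DER is the error obtained by assigning all speech to a single speaker, with no overlapping speech and no speech/non-speech errors. The function names, the per-speaker duration dictionary, and the Pearson helper for correlating a feature with per-show DER are all hypothetical and chosen for illustration only.

```python
from math import sqrt


def do_nothing_der(speaker_seconds):
    """Assumed definition: fraction of speech time NOT spoken by the
    dominant speaker, i.e. the error of labelling everything as one speaker.

    speaker_seconds maps a speaker label to total speaking time in seconds.
    Overlapping speech and non-speech errors are ignored (an assumption).
    """
    total = sum(speaker_seconds.values())
    if total <= 0:
        raise ValueError("show contains no speech time")
    return 1.0 - max(speaker_seconds.values()) / total


def pearson(xs, ys):
    """Plain Pearson correlation between a per-show feature and per-show DER."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, a show in which one speaker holds 60 of 100 seconds of speech has a do-nothing DER of 0.4. Applying `pearson` to a feature column and the observed DER across shows would give the kind of correlate ranking the abstract refers to.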