Paper: | SLP-P8.1 |
Session: | Speaker Recognition: Features |
Time: | Wednesday, May 17, 10:00 - 12:00 |
Presentation: | Poster |
Topic: | Speech and Spoken Language Processing: Speaker Verification |
Title: | Speaker Verification over Handheld Devices with Realistic Noisy Speech Data |
Authors: | Ming Ji, Queen's University, Belfast, United Kingdom; Timothy Hazen, James R. Glass, Massachusetts Institute of Technology, United States |
Abstract: | We study speaker verification for handheld devices assuming realistic, noisy test conditions and assuming no prior knowledge of the noise characteristics. Data were recorded in office (“quiet”) and street intersection (“noisy”) environments, with the use of an internal microphone and an external headset. We assume that the speaker models are trained using the office data and tested in matched and mismatched environment/microphone conditions. Two approaches were studied, both built upon a subband feature framework: 1) a posterior union model (PUM) that focuses verification on matching subbands thereby reducing the effect of the training and testing mismatch, and 2) universal compensation (UC) that combines multi-condition training and the PUM to provide robustness to noises of arbitrary temporal-spectral characteristics. Multi-condition training using simulated noise data of different characteristics provides a “coarse” compensation for the noise, and the PUM refines the compensation by ignoring noise variations outside the given training conditions. These two models were compared to baseline systems and have shown improved robustness for realistic noisy speech data. |
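
The idea of focusing verification on matching subbands can be illustrated with a small sketch. This is not the PUM formulation from the paper, which the abstract does not spell out. It is a deliberately simplified stand-in that keeps only the best-scoring subbands. The function name `topk_subband_score`, the `keep` parameter, and all numeric values are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def topk_subband_score(subband_loglikes: Sequence[float], keep: int) -> float:
    """Sum the `keep` largest per-subband log-likelihoods.

    Simplified illustration only: a subband whose score collapses
    (for example, because of noise unseen in training) is left out of the sum.
    """
    if not 1 <= keep <= len(subband_loglikes):
        raise ValueError("keep must be between 1 and the number of subbands")
    best = sorted(subband_loglikes, reverse=True)[:keep]
    return sum(best)


# Hypothetical log-likelihoods for five subbands of one test utterance.
# The two very low values stand in for subbands corrupted by noise.
clean_like = [-1.0, -1.2, -0.9, -1.1, -1.0]
noisy_like = [-1.0, -9.0, -0.9, -8.5, -1.0]

# Score drop caused by the noise when every subband is used.
full_band_drop = sum(clean_like) - sum(noisy_like)

# Score drop when only the three best-matching subbands are used.
selected_drop = topk_subband_score(clean_like, 3) - topk_subband_score(noisy_like, 3)

print(f"full-band score drop: {full_band_drop:.1f}")
print(f"selected-subband score drop: {selected_drop:.1f}")
```

In this toy case the corrupted subbands dominate the full-band score, while the selected-subband score is unaffected because the corrupted subbands are never among the three best. A real system would have to choose which subbands to trust without knowing in advance how many are corrupted, which is the problem the abstract attributes to the PUM.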