Technical Program

Paper Detail

Paper:	SLP-P3.5
Session:	Novel LVCSR Algorithms
Time:	Tuesday, May 16, 14:00 - 16:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Alternative Statistical and Machine Learning Methods for General ASR (e.g., no-HMM methods)
Title:	A NEW DATA SELECTION APPROACH FOR SEMI-SUPERVISED ACOUSTIC MODELING
Authors:	Rong Zhang, Alexander Rudnicky, Carnegie Mellon University, United States
Abstract:	Current approaches to semi-supervised incremental learning prefer to select unlabeled examples predicted with high confidence for model re-training. However, this strategy can degrade classification performance rather than improve it. We present an analysis of the reasons for this phenomenon, showing that relying only on high confidence for data selection can lead to an erroneous estimate of the true distribution when the confidence annotator is highly correlated with the classifier in the information they use. We propose a new data selection approach to address this problem and apply it to a variety of applications, including machine learning and speech recognition. Encouraging improvements in recognition accuracy are observed in our experiments.