Technical Program

Paper Detail

Paper:	SLP-P15.7
Session:	Spoken Document Search, Navigation and Summarization
Time:	Thursday, May 18, 14:00 - 16:00
Presentation:	Poster
Topic:	Speech and Spoken Language Processing: Speech data mining and document retrieval
Title:	Maximum Entropy Based Normalization of Word Posteriors for Phonetic and LVCSR Lattice Search
Authors:	Peng Yu, Microsoft Research Asia, China; Duo Zhang, Tsinghua University, China; Frank Seide, Microsoft Research Asia, China
Abstract:	In many keyword-spotting systems, the word posterior probability is an elementary quantity. This paper investigates the problem of providing "correct" posteriors in the context of lattice-based word spotting. Unlike other work on word posteriors, which focuses on the relative ranking of posteriors, we emphasize the relevance of the absolute value of the posterior in our user scenario. We stipulate that the posteriors should approach empirical precisions in the limit. Using this as a constraint, we estimate a mapping function based on Maximum Entropy. We find that for posteriors generated from phonetic lattices, the mapped posteriors are satisfactorily consistent with empirical precision. In a joint search task, where different words are ranked together by posterior, the Figure of Merit (FOM) improved from 11.2% to 57.8%, which demonstrates the effectiveness of the method. When the method is applied to searching LVCSR-based word lattices, the improvement is negligible, but the method is still effective when combining phonetic and word-lattice search in a hybrid mode, yielding an improvement from 46.7% to 65.8%.
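The requirement that posteriors approach empirical precision can be checked with a simple binning procedure. The Python sketch below is only an illustration of that consistency check, not the Maximum Entropy mapping estimated in the paper. The function name `empirical_precision_by_bin`, the equal-width binning, and the input format (one posterior and one correctness flag per hit) are assumptions made for this example.

```python
from typing import List, Tuple


def empirical_precision_by_bin(
    posteriors: List[float],
    is_correct: List[bool],
    num_bins: int = 5,
) -> List[Tuple[float, float, int]]:
    """Group hits into equal-width posterior bins (an assumed scheme).

    Returns one (mean posterior, empirical precision, hit count) tuple
    for each non-empty bin, ordered from low to high posterior.
    """
    if len(posteriors) != len(is_correct):
        raise ValueError("posteriors and is_correct must have equal length")

    sums = [0.0] * num_bins   # sum of posteriors per bin
    hits = [0] * num_bins     # number of correct hits per bin
    counts = [0] * num_bins   # number of hits per bin

    for p, ok in zip(posteriors, is_correct):
        if not 0.0 <= p <= 1.0:
            raise ValueError("posterior outside [0, 1]: %r" % p)
        # A posterior of exactly 1.0 falls into the last bin.
        b = min(int(p * num_bins), num_bins - 1)
        sums[b] += p
        hits[b] += int(ok)
        counts[b] += 1

    return [
        (sums[b] / counts[b], hits[b] / counts[b], counts[b])
        for b in range(num_bins)
        if counts[b] > 0
    ]
```

If the posteriors are consistent with empirical precision in the sense described above, the first two values of each tuple should be close for bins that contain enough hits.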