Paper: | SLP-P15.7 |
Session: | Spoken Document Search, Navigation and Summarization |
Time: | Thursday, May 18, 14:00 - 16:00 |
Presentation: | Poster |
Topic: |
Speech and Spoken Language Processing: Speech data mining and document retrieval |
Title: |
Maximum Entropy Based Normalization of Word Posteriors for Phonetic and LVCSR Lattice Search |
Authors: |
Peng Yu, Microsoft Research Asia, China; Duo Zhang, Tsinghua University, China; Frank Seide, Microsoft Research Asia, China |
Abstract: |
In many keyword-spotting systems, the word posterior probability is an elementary quantity. This paper investigates the problem of providing "correct" posteriors in the context of lattice-based word spotting. Unlike other work on word posteriors, which focuses on the relative ranking of posteriors, we emphasize the relevance of the absolute value of the posterior in our user scenario. We stipulate that the posteriors should approach empirical precisions in a limit sense. Using this as a constraint, we estimate a mapping function based on Maximum Entropy. We find that for posteriors generated from phonetic lattices, mapped posteriors are satisfyingly consistent with empirical precision. In a joint search task, where different words are ranked together by posterior, FOM (Figure Of Merit) improved from 11.2% to 57.8%, demonstrating the effectiveness of the method. Applied to searching LVCSR-based word lattices, the improvement is negligible, but the method is still effective when combining phonetic and word-lattice search in a hybrid mode, yielding an improvement from 46.7% to 65.8%. |
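The requirement that posteriors approach empirical precision can be illustrated with a simple binning check. The sketch below is not the paper's Maximum Entropy mapping; it only measures how far raw posteriors are from the observed precision. The function name `empirical_precision_by_bin`, the `(posterior, is_correct)` input format, and the choice of equal-width bins are all assumptions made for illustration.

```python
from collections import defaultdict


def empirical_precision_by_bin(hits, num_bins=10):
    """Group keyword hits by posterior and measure precision per bin.

    hits: iterable of (posterior, is_correct) pairs, posterior in [0, 1].
    Returns {bin_index: (mean_posterior, precision, count)}.
    Hypothetical helper: illustrates the calibration target only,
    not the Maximum Entropy mapping described in the abstract.
    """
    # Per bin: [sum of posteriors, number of correct hits, number of hits]
    sums = defaultdict(lambda: [0.0, 0, 0])
    for posterior, is_correct in hits:
        # Clamp so that a posterior of exactly 1.0 lands in the last bin.
        b = min(int(posterior * num_bins), num_bins - 1)
        acc = sums[b]
        acc[0] += posterior
        acc[1] += 1 if is_correct else 0
        acc[2] += 1
    return {b: (s / n, c / n, n) for b, (s, c, n) in sorted(sums.items())}


if __name__ == "__main__":
    # Toy, made-up hits purely to show the call; not data from the paper.
    toy_hits = [(0.95, True), (0.92, True), (0.91, False),
                (0.15, False), (0.12, True), (0.11, False), (0.13, False)]
    for b, (mean_p, prec, n) in empirical_precision_by_bin(toy_hits).items():
        print(f"bin {b}: mean posterior {mean_p:.3f}, precision {prec:.3f}, n={n}")
```

For well-calibrated posteriors, the mean posterior and the precision in each bin would be close to each other; a large gap in a bin indicates where a mapping function would need to adjust the scores.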