Paper: SLP-P17.4
Session: Spoken Language Modeling, Identification and Characterization
Time: Thursday, May 18, 16:30 - 18:30
Presentation: Poster
Topic: Speech and Spoken Language Processing: Language Modeling and Adaptation
Title: PROFILE BASED COMPRESSION OF N-GRAM LANGUAGE MODELS
Authors: Jesper Olsen, Daniela Oria, Nokia, Finland
Abstract: A profile-based technique for encoding and compressing n-gram language models is presented. The technique is intended to be used in combination with existing size-reduction techniques for n-gram language models, such as pruning, quantisation and word-class modelling. It is evaluated here on an embedded large-vocabulary speech recognition task. When combined with quantisation, the technique reduces the memory needed for storing probabilities by a factor of 10 with little or no degradation in word accuracy. The structure of the language model is well suited to “best-first” decoding styles and is used here to guide an isolated-word recogniser. It is well suited to predicting several likely word continuations, but computationally less suitable for efficient lookup of individual n-gram probabilities.
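To illustrate the quantisation component the abstract refers to, the sketch below shows one common way of quantising n-gram log-probabilities into a small codebook, so that each probability is stored as a short index rather than a full float. This is a minimal illustration of generic uniform quantisation, not the paper's specific profile-based encoding; the function names and toy probabilities are hypothetical.

```python
# Hypothetical sketch of n-gram probability quantisation: map each
# log-probability to the nearest entry in a small uniform codebook, so a
# compact index (here 5 bits for 32 levels) is stored instead of a float.
import math

def build_codebook(log_probs, levels=32):
    """Uniformly partition the log-probability range into `levels` bins
    and return the bin centres as the codebook."""
    lo, hi = min(log_probs), max(log_probs)
    step = (hi - lo) / levels
    return [lo + (i + 0.5) * step for i in range(levels)]

def quantise(log_prob, codebook):
    """Return the index of the nearest codebook entry."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - log_prob))

# Toy n-gram log-probabilities (illustrative values only).
lps = [math.log10(p) for p in (0.2, 0.05, 0.01, 0.004, 0.0007)]
cb = build_codebook(lps, levels=32)
indices = [quantise(lp, cb) for lp in lps]
recovered = [cb[i] for i in indices]
# Nearest-centre quantisation keeps each recovered value within half a
# bin width of the original log-probability.
```

In practice the quantisation error is controlled by the number of levels, and the memory saving comes from storing the short indices (optionally bit-packed) plus one shared codebook instead of per-entry floats.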