Paper: | SLP-P17.5 |
Session: | Spoken Language Modeling, Identification and Characterization |
Time: | Thursday, May 18, 16:30 - 18:30 |
Presentation: | Poster |
Topic: |
Speech and Spoken Language Processing: Language modeling and Adaptation |
Title: |
BAYESIAN LEARNING OF N-GRAM STATISTICAL LANGUAGE MODELING |
Authors: |
Shuanhu Bai, Haizhou Li, Institute for Infocomm Research, Singapore |
Abstract: |
N-gram language model adaptation is typically formulated using deleted interpolation under the maximum likelihood estimation framework. This paper proposes a Bayesian learning framework for n-gram statistical language model training and adaptation. By introducing a Dirichlet conjugate prior on the n-gram parameters, we formulate deleted interpolation under the maximum a posteriori (MAP) criterion with a Bayesian learning procedure. We study the Bayesian learning formulation for n-gram and continuous n-gram language models. Experiments on the North American News Text corpus validate the effectiveness of the proposed algorithms. |
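
The link between a Dirichlet prior and interpolation can be illustrated with a minimal sketch. It is not taken from the paper and may differ from the authors' formulation. It assumes a bigram model, a hypothetical prior strength `mu`, and Dirichlet hyperparameters of the form alpha_w = 1 + mu * prior(w), where `prior` is a background distribution (here a unigram model). Under these assumptions the MAP estimate equals a linear interpolation of the maximum likelihood estimate and the prior, with weight c(h) / (c(h) + mu). All function and variable names are illustrative.

```python
from collections import Counter


def count_bigrams(tokens):
    """Return bigram counts c(h, w) and history totals c(h)."""
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))
    history_totals = Counter(tokens[:-1])
    return bigram_counts, history_totals


def unigram_prior(tokens):
    """Background distribution used as the prior mean (an assumption of this sketch)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def map_prob(word, history, bigram_counts, history_totals, prior, mu):
    """MAP estimate of P(word | history), assuming alpha_w = 1 + mu * prior[word].

    The posterior mode is (c(h, w) + mu * prior[w]) / (c(h) + mu).
    """
    c_hw = bigram_counts.get((history, word), 0)
    c_h = history_totals.get(history, 0)
    return (c_hw + mu * prior[word]) / (c_h + mu)


def interpolated_prob(word, history, bigram_counts, history_totals, prior, mu):
    """The same quantity written as lam * P_ML + (1 - lam) * prior."""
    c_hw = bigram_counts.get((history, word), 0)
    c_h = history_totals.get(history, 0)
    lam = c_h / (c_h + mu)                   # count-dependent interpolation weight
    p_ml = c_hw / c_h if c_h > 0 else 0.0    # maximum likelihood estimate
    return lam * p_ml + (1.0 - lam) * prior[word]


if __name__ == "__main__":
    tokens = "a b a b a c".split()
    bigrams, totals = count_bigrams(tokens)
    prior = unigram_prior(tokens)
    for w in sorted(prior):
        print(w, map_prob(w, "a", bigrams, totals, prior, mu=2.0))
```

In this sketch the interpolation weight grows with the history count, so frequently observed histories rely more on their own counts and rare histories fall back toward the prior.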