Publication
INTERSPEECH 2010
Conference paper
Decoding with shrinkage-based language models
Abstract
In this paper, we investigate the use of a class-based exponential language model directly integrated into speech recognition and machine translation decoders. Recently, a novel class-based language model, Model M, was introduced and shown to outperform regular n-gram models on moderate amounts of Wall Street Journal data. This model was motivated by the observation that shrinking the sum of the parameter magnitudes in an exponential language model leads to better performance on unseen data. In this paper, we directly integrate this shrinkage-based language model into two state-of-the-art machine translation engines as well as a large-scale dynamic speech recognition decoder. Experiments on standard GALE and NIST development and evaluation sets show considerable and consistent improvements in machine translation quality and reductions in speech recognition word error rate. © 2010 ISCA.
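To make the shrinkage idea concrete, the following is a minimal sketch of the regularized training objective for an exponential language model, assuming the combined ℓ1 + ℓ2² penalty used in the related literature on shrinking exponential language models; the notation (feature weights λ_i, features f_i, training pairs (x_j, y_j), and hyperparameters α and σ) is an illustrative assumption and is not specified in this abstract:

\[
p_{\Lambda}(y \mid x) \;=\; \frac{\exp\!\left(\sum_i \lambda_i f_i(x, y)\right)}{\sum_{y'} \exp\!\left(\sum_i \lambda_i f_i(x, y')\right)},
\qquad
\mathcal{O}(\Lambda) \;=\; \sum_{j=1}^{D} \log p_{\Lambda}(y_j \mid x_j)
\;-\; \alpha \sum_i |\lambda_i|
\;-\; \frac{1}{2\sigma^2} \sum_i \lambda_i^2 .
\]

Under this formulation, the ℓ1 term α Σ_i |λ_i| penalizes the sum of the parameter magnitudes, so maximizing 𝒪(Λ) shrinks that sum, which is the mechanism the abstract credits for improved performance on unseen data.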