Publication

INTERSPEECH 2010 · Conference paper

Decoding with shrinkage-based language models

Abstract

In this paper, we investigate the use of a class-based exponential language model when directly integrated into speech recognition or machine translation decoders. Recently, a novel class-based language model, Model M, was introduced and was shown to outperform regular n-gram models on moderate amounts of Wall Street Journal data. This model was motivated by the observation that shrinking the sum of the parameter magnitudes in an exponential language model leads to better performance on unseen data. In this paper we directly integrate the shrinkage-based language model into two different state-of-the-art machine translation engines as well as a large-scale dynamic speech recognition decoder. Experiments on standard GALE and NIST development and evaluation sets show considerable and consistent improvement in both machine translation quality and speech recognition word error rate. © 2010 ISCA.
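For context, a minimal sketch of the shrinkage idea the abstract refers to (not quoted from the paper itself): in this line of work on exponential language models, training maximizes the log-likelihood of the data minus a penalty on the parameter magnitudes, and Model M additionally factors word prediction through word classes. Assuming the commonly used ℓ1+ℓ2 penalty with hyperparameters α and σ², and a trigram class factorization with word classes c_j, the objective and model take roughly this form:

\mathcal{O}(\Lambda) = \sum_{i} \log p_{\Lambda}(w_i \mid h_i) \;-\; \alpha \sum_{j} |\lambda_j| \;-\; \frac{1}{2\sigma^2} \sum_{j} \lambda_j^2

p(w_j \mid w_{j-2}\, w_{j-1}) \approx p(c_j \mid c_{j-2}\, c_{j-1},\, w_{j-2}\, w_{j-1}) \cdot p(w_j \mid w_{j-2}\, w_{j-1},\, c_j)

Here shrinking Σ|λ_j| (the sum of parameter magnitudes mentioned in the abstract) is what drives the improved generalization on unseen data; the exact penalty weights and factorization details are assumptions drawn from the related Model M literature, not from this abstract.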

Date

26 Sep 2010
