About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
INTERSPEECH 2010
Conference paper
Enhanced word classing for model M
Abstract
Model M is a superior class-based n-gram model that has shown improvements on a variety of tasks and domains. In previous work with Model M, bigram mutual information clustering has been used to derive word classes. In this paper, we introduce a new word classing method designed to closely match with Model M. The proposed classing technique achieves gains in speech recognition word-error rate of up to 1.1% absolute over the baseline clustering, and a total gain of up to 3.0% absolute over a Katz-smoothed trigram model, the largest such gain ever reported for a class-based language model. © 2010 ISCA.