Publication
ICASSP 1995
Conference paper
Language model adaptation via minimum discrimination information
Abstract
Statistical language models improve the performance of speech recognition systems by providing estimates of the a priori probabilities of word sequences. The commonly used trigram language models estimate the conditional probability of a word given the previous two words from a large corpus of text. The text corpus is often a collection of several small, diverse segments, such as newspaper articles or conversations on different topics. Knowledge of the current topic could be used to adapt the general trigram language model to match that topic closely. For example, an interpolation of the general language model with one built on the topic data could be used. We first discuss the adaptation of general trigram language models to a known topic using the minimum discrimination information (MDI) method. We then present results on the Switchboard corpus, which consists of telephone conversations on several topics.
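
The interpolation mentioned in the abstract as an example can be illustrated with a minimal sketch. This is not the paper's MDI method: it assumes unsmoothed maximum-likelihood trigram estimates, toy token lists, and a hypothetical fixed interpolation weight `lam`. The function names `trigram_mle` and `interpolate` are illustrative only.

```python
from collections import Counter


def trigram_mle(tokens):
    """Return p(w | u, v) estimated by unsmoothed maximum likelihood."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    # Count how often each two-word history (u, v) is followed by any word.
    histories = Counter()
    for (u, v, _w), count in trigrams.items():
        histories[(u, v)] += count

    def prob(w, u, v):
        total = histories[(u, v)]
        # Unseen histories get probability 0.0 here; a real model would smooth.
        return trigrams[(u, v, w)] / total if total else 0.0

    return prob


def interpolate(p_general, p_topic, lam):
    """Linear interpolation: lam * topic model + (1 - lam) * general model."""
    def prob(w, u, v):
        return lam * p_topic(w, u, v) + (1.0 - lam) * p_general(w, u, v)

    return prob


# Toy data (assumed for illustration, not taken from any corpus).
general_tokens = "the cat sat on the mat the cat ran".split()
topic_tokens = "the cat sat on the sofa".split()

p_general = trigram_mle(general_tokens)
p_topic = trigram_mle(topic_tokens)
p_adapted = interpolate(p_general, p_topic, lam=0.5)

print(p_general("sat", "the", "cat"))  # general estimate
print(p_topic("sat", "the", "cat"))    # topic estimate
print(p_adapted("sat", "the", "cat"))  # interpolated estimate
```

In practice the weight would be tuned on held-out topic data and both component models would be smoothed; the sketch fixes `lam` only to keep the arithmetic easy to follow.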