Publication
ICASSP 1990
Conference paper
Classifying words for improved statistical language models
Abstract
A method is presented for assigning a word to multiple classes based on the contexts in which the word occurs. A trigram language model is used to determine these classes, which are called the statistical synonyms of that word. This classification method is used to build an adaptive language model that incorporates unknown words after their first occurrence, using their statistical synonyms to determine the model's probabilities for the added words. It is shown that the dynamic coverage of the language model increases significantly, with a rather low perplexity on the added words.
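The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a count-based trigram model with unsmoothed maximum-likelihood estimates, takes the known words that the model finds most probable in the context of an unknown word's first occurrence as its statistical synonyms, and estimates the new word's probability in a context as the mean of its synonyms' probabilities. All function names, the toy corpus, and the averaging rule are assumptions made for illustration; the sketch also does not renormalize the distribution after adding the word.

```python
from collections import Counter, defaultdict


def train_trigram_counts(sentences):
    """Map each two-word history (w1, w2) to a Counter of following words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence + ["</s>"]
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            counts[(w1, w2)][w3] += 1
    return counts


def trigram_prob(counts, w1, w2, w3):
    """Unsmoothed maximum-likelihood estimate of P(w3 | w1, w2)."""
    followers = counts.get((w1, w2))
    if not followers:
        return 0.0
    return followers[w3] / sum(followers.values())


def statistical_synonyms(counts, history, k=3):
    """Hypothetical rule: the k known words most likely after `history`."""
    followers = counts.get(tuple(history), Counter())
    return [word for word, _ in followers.most_common(k)]


def adapted_prob(counts, synonyms, w1, w2):
    """Assumed estimate of P(new_word | w1, w2): mean over its synonyms."""
    if not synonyms:
        return 0.0
    total = sum(trigram_prob(counts, w1, w2, s) for s in synonyms)
    return total / len(synonyms)


# Toy corpus (made up for this example).
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
counts = train_trigram_counts(corpus)

# The unknown word "fox" is first seen in "the fox sat",
# so its two-word history is ("<s>", "the").
synonyms = statistical_synonyms(counts, ("<s>", "the"), k=2)
fox_prob = adapted_prob(counts, synonyms, "<s>", "the")
print(synonyms, fox_prob)
```

A practical system would need smoothing, renormalization, and a way to use the right-hand context of the unknown word as well; the sketch leaves these out to keep the synonym idea visible.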