Estimation of Probabilities in the Language Model of the IBM Speech Recognition System

Abstract

The language model probabilities are estimated by an empirical Bayes approach in which a prior distribution for the unknown probabilities is itself estimated through a novel choice of data. The predictive power of the model fitted in this way is compared, by means of its experimental perplexity [1], with that of the same model fitted by the Jelinek-Mercer deleted estimator and by the Turing-Good formulas for the probabilities of unseen or rarely seen events. Copyright © 1984 by The Institute of Electrical and Electronics Engineers, Inc.
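As one point of comparison for the abstract's Turing-Good baseline, the adjusted-count formula is r* = (r+1) N_{r+1}/N_r, where N_r is the number of event types observed exactly r times, and the probability mass reserved for unseen events is N_1/N. The following is a minimal sketch of that computation (the function name and fallback rule for sparse high counts are illustrative assumptions, not taken from the paper):

```python
from collections import Counter

def turing_good(counts):
    """Turing-Good adjusted counts: r* = (r+1) * N_{r+1} / N_r.

    counts: mapping from event (e.g. word) to its observed count r.
    Returns (adjusted_counts, p_unseen), where p_unseen = N_1 / N is the
    total probability mass assigned to unseen events.
    """
    # N_r: how many distinct events were seen exactly r times.
    freq_of_freq = Counter(counts.values())
    total = sum(counts.values())  # N, the total number of observations
    adjusted = {}
    for event, r in counts.items():
        n_r = freq_of_freq[r]
        n_r1 = freq_of_freq.get(r + 1, 0)
        # Fallback assumption: keep the raw count when N_{r+1} = 0,
        # as happens for sparse high counts without smoothing the N_r.
        adjusted[event] = (r + 1) * n_r1 / n_r if n_r1 else float(r)
    p_unseen = freq_of_freq.get(1, 0) / total
    return adjusted, p_unseen
```

For example, on the counts {a: 3, b: 2, c: 1} the singleton `c` receives the adjusted count 2 * N_2 / N_1 = 2, and the unseen mass is N_1 / N = 1/6.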
