Estimation of Probabilities in the Language Model of the IBM Speech Recognition System

Arthur Nádas

doi:10.1109/TASSP.1984.1164378

IEEE Transactions on Acoustics, Speech, and Signal Processing

Paper

01 Jan 1984

Abstract

The language model probabilities are estimated by an empirical Bayes approach in which a prior distribution for the unknown probabilities is itself estimated through a novel choice of data. The predictive power of the model thus fitted is compared by means of its experimental perplexity [1] to the model as fitted by the Jelinek-Mercer deleted estimator and as fitted by the Turing-Good formulas for probabilities of unseen or rarely seen events. Copyright © 1984 by The Institute of Electrical and Electronics Engineers, Inc.
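
As a rough illustration of two quantities named in the abstract, the sketch below computes the perplexity of a test sequence from the per-token probabilities a model assigns to it, and the Turing-Good estimate of the total probability of unseen events (the number of event types seen exactly once divided by the sample size). The function names and the toy inputs are assumptions made for this example. It does not reproduce the paper's empirical Bayes estimator or the Jelinek-Mercer deleted estimator.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """Perplexity 2^(-average log2 probability) over the test tokens.

    token_probs: the probability the model assigned to each test token
    (hypothetical input; all values must be > 0).
    """
    avg_log2 = sum(math.log2(p) for p in token_probs) / len(token_probs)
    return 2 ** (-avg_log2)


def turing_good_unseen_mass(events):
    """Turing-Good estimate of the total probability of unseen events.

    Computed as N1 / N, where N1 is the number of distinct events
    observed exactly once and N is the total number of observations.
    """
    counts = Counter(events)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(events)


if __name__ == "__main__":
    # A model that assigns probability 1/4 to every test token
    # behaves like a uniform choice among 4 alternatives.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
    # Toy sample: "b" and "c" each occur once among 4 observations.
    print(turing_good_unseen_mass(["a", "b", "a", "c"]))
```

Lower perplexity on held-out text indicates better predictive power, which is the basis on which the abstract compares the three estimators.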
