Estimating Hidden Markov Model Parameters So As To Maximize Speech Recognition Accuracy
Abstract
This paper is concerned with the problem of estimating the parameter values of hidden Markov word models for speech recognition. It is argued that maximum-likelihood estimation of the parameters via the forward-backward algorithm may not lead to values that maximize recognition accuracy. The paper describes an alternative estimation procedure called corrective training, which is aimed at minimizing the number of recognition errors. Corrective training is similar to a well-known error-correcting training procedure for linear classifiers and works by iteratively adjusting the parameter values so as to make correct words more probable and incorrect words less probable. There are strong parallels between corrective training and maximum mutual information estimation; the relationship of these two techniques is discussed and a comparison is made of their performance. Although it has not been proved that the corrective training algorithm converges, experimental evidence suggests that it does, and that it leads to fewer recognition errors than can be obtained with conventional training methods. © 1993 IEEE
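The update rule sketched in the abstract, nudging parameters toward the correct word's alignment and away from a misrecognized competitor's, can be illustrated schematically. The following is a minimal sketch, not the paper's exact formulation: the function name, the learning rate `gamma`, and the count-flooring constant are all illustrative assumptions.

```python
import numpy as np

def corrective_update(params, correct_counts, incorrect_counts,
                      gamma=0.1, floor=1e-4):
    """Adjust a row-stochastic HMM parameter matrix (e.g. transition
    probabilities) by adding counts from the correct word's alignment
    and subtracting counts from an incorrect (misrecognized or
    near-miss) word's alignment, then renormalizing each row so it
    remains a valid probability distribution.

    gamma and floor are hypothetical tuning constants, not values
    from the paper."""
    counts = params + gamma * (correct_counts - incorrect_counts)
    counts = np.maximum(counts, floor)  # keep all probabilities positive
    return counts / counts.sum(axis=1, keepdims=True)

# Toy example: a 2-state transition matrix is pushed toward the
# self-loop structure used by the correct word's alignment.
A = np.array([[0.5, 0.5],
              [0.5, 0.5]])
correct = np.array([[1.0, 0.0],
                    [0.0, 1.0]])  # counts along the correct path
wrong = np.array([[0.0, 1.0],
                  [1.0, 0.0]])    # counts along the incorrect path
A_new = corrective_update(A, correct, wrong)
```

Iterating this update over the training data makes the correct transcription more probable relative to its competitors, which is the error-driven behavior the abstract attributes to corrective training.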