Publication
ICSLP 2004
Conference paper
Maximum entropy direct model as a unified model for acoustic modeling in speech recognition
Abstract
Traditional statistical models for speech recognition have been dominated by generative models such as Hidden Markov Models (HMMs). We recently proposed a new framework for speech recognition based on maximum entropy direct modeling, in which the probability of a state or word sequence given an observation sequence is computed directly from the model. In contrast to HMMs, the features used can be non-independent, asynchronous, and overlapping. In this paper, we discuss how to make the computationally intensive training of such models feasible by parallelizing the Improved Iterative Scaling (IIS) algorithm. As a stand-alone acoustic model, the direct model significantly outperforms traditional HMMs in word error rate. When combined with HMM and language-model scores, it yields modest improvements over the best HMM system. The maximum entropy model can potentially incorporate non-independent features, such as acoustic-phonetic features, in a way that is robust to features missing due to mismatch between training and testing.
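To make the two ideas in the abstract concrete, the sketch below is a minimal, hypothetical NumPy illustration (not the paper's implementation): it computes the direct-model posterior P(y | x) ∝ exp(Σᵢ λᵢ fᵢ(x, y)) from weighted feature functions, and shows how the model-expected feature counts that dominate each IIS iteration can be computed in parallel by sharding the training data and reducing the partial sums. All names (`posterior`, `shard_expected_counts`, the shapes assumed for the feature matrices) are illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def posterior(feat, weights):
    """Direct model: P(y | x) = exp(sum_i lambda_i f_i(x, y)) / Z(x).

    `feat` is an (n_labels, n_features) matrix holding the feature-function
    values f_i(x, y) for one observation x and every candidate label y.
    """
    scores = feat @ weights          # unnormalized log-scores, one per label
    scores -= scores.max()           # subtract the max for numerical stability
    p = np.exp(scores)
    return p / p.sum()               # divide by the partition function Z(x)

def shard_expected_counts(args):
    """Model-expected feature counts E_p[f_i] over one shard of the data.

    Each worker handles a disjoint shard of the training set; the partial
    sums are then added together. This expectation step dominates the cost
    of each IIS iteration and parallelizes with a simple scatter/reduce.
    """
    shard, weights = args
    counts = np.zeros_like(weights)
    for feat in shard:               # feat: (n_labels, n_features) for one x
        counts += posterior(feat, weights) @ feat
    return counts

def parallel_expected_counts(shards, weights, workers=4):
    """Scatter the shards to worker processes, then reduce the partials."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        jobs = [(shard, weights) for shard in shards]
        partials = pool.map(shard_expected_counts, jobs)
    return sum(partials)
```

These expected counts are compared against the empirical feature counts to compute each IIS weight update; only the per-shard expectation pass needs the full data, which is what makes the data-parallel decomposition effective.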