About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
INTERSPEECH - Eurospeech 1999
Conference paper
ADAPTATION TO ENVIRONMENT AND SPEAKER USING MAXIMUM LIKELIHOOD NEURAL NETWORKS
Abstract
When there is a mismatch between training and testing conditions, statistical speech recognition algorithms suffer from severe degradation in recognition accuracy. The mismatch could be due to the interference from acoustical environments where systems are actually used or from speakers themselves. In this paper, a neural network based transformation approach is studied to handle the data distribution mismatches between training and testing conditions. The conditional probability that comes from hidden Markov model (HMM) based recognizers is used for the objective function of a neural network. It maximizes the likelihood of the data from a testing environment, and allows global optimization of the network when used with HMM-based recognizers. The new objective function can be used to transform speech feature vectors, or the mean vectors and covariance matrices of a recognizer. The proposed algorithm is evaluated on a noisy distant-talking version of the Resource Management database.