Publication
ICASSP 2014
Conference paper
Reduction of acoustic model training time and required data passes via stochastic approaches to maximum likelihood and discriminative training
Abstract
The recent boom in the use of speech recognition technology has made access to potentially large amounts of training data easier. This, however, also poses a challenge: processing such a large, continuously growing amount of data. Here we present a stochastic modification of the traditional iterative training approach that achieves the same or better accuracy of acoustic models while reducing the cost of processing large data sets. The algorithm relies on model updates from statistics collected on randomly selected subsets of the training data. The approach is demonstrated on maximum likelihood (ML) training and on discriminative training (DT) with the minimum phone error (MPE) objective function, in both the feature and the model space. In our experiments on 30 thousand hours of mobile data, the number of data passes was reduced to 1/5 of the original for ML training and to 1/10 for model-space DT training. © 2014 IEEE.
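The core idea of the abstract, re-estimating a model from sufficient statistics accumulated on a random subset of the data rather than a full pass, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name, the single Gaussian-mean parameter, and the smoothing weight `alpha` are all hypothetical choices for the sake of a runnable example.

```python
import random

def stochastic_ml_update(data, subset_frac=0.2, epochs=5, alpha=0.5, seed=0):
    """Sketch of subset-based iterative training.

    Each "pass" draws a random subset, accumulates sufficient
    statistics on it (here just the sum and count needed to
    estimate a Gaussian mean), and interpolates the re-estimate
    with the previous model. Touching only subset_frac of the
    data per pass is what reduces the number of full data passes.
    """
    rng = random.Random(seed)
    mean = 0.0  # hypothetical initial model parameter
    for _ in range(epochs):
        subset = rng.sample(data, max(1, int(subset_frac * len(data))))
        # Accumulate sufficient statistics on the subset only.
        stat_sum, stat_count = sum(subset), len(subset)
        subset_mean = stat_sum / stat_count
        # Smoothed update: interpolate new estimate with old model.
        mean = (1.0 - alpha) * mean + alpha * subset_mean
    return mean
```

With `subset_frac=0.2` and 5 passes, the model sees roughly one full-data-pass worth of samples in total, yet the estimate converges toward the full-data statistic, which is the intuition behind the 1/5 and 1/10 data-pass reductions reported above.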