Publication
ICASSP 2014
Conference paper
Reduction of acoustic model training time and required data passes via stochastic approaches to maximum likelihood and discriminative training
Abstract
The recent boom in the use of speech recognition technology has made access to potentially large amounts of training data easier. This, however, also poses a challenge: processing such a large, continuously growing amount of data. Here we present a stochastic modification of the traditional iterative training approach that achieves the same or better accuracy of acoustic models while reducing the cost of processing large data sets. The algorithm relies on model updates from statistics collected on randomly selected subsets of the training data. The approach is demonstrated on maximum likelihood (ML) training and on discriminative training (DT) with the minimum phone error (MPE) objective function, in both the feature and the model space. In our experiments on 30 thousand hours of mobile data, the number of data passes was reduced to 1/5 of the original for ML training and to 1/10 for model-space DT training. © 2014 IEEE.
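The core idea of the abstract, re-estimating a model from sufficient statistics accumulated on a random subset of the data rather than a full pass, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name, the single Gaussian-mean parameter, and the smoothing weight `alpha` are all hypothetical choices for the sake of a runnable example.

```python
import random

def stochastic_ml_update(data, subset_frac=0.2, epochs=5, alpha=0.5, seed=0):
    """Sketch of subset-based iterative training.

    Each "pass" draws a random subset, accumulates sufficient
    statistics on it (here just the sum and count needed to
    estimate a Gaussian mean), and interpolates the re-estimate
    with the previous model. Touching only subset_frac of the
    data per pass is what reduces the number of full data passes.
    """
    rng = random.Random(seed)
    mean = 0.0  # hypothetical initial model parameter
    for _ in range(epochs):
        subset = rng.sample(data, max(1, int(subset_frac * len(data))))
        # Accumulate sufficient statistics on the subset only.
        stat_sum, stat_count = sum(subset), len(subset)
        subset_mean = stat_sum / stat_count
        # Smoothed update: interpolate new estimate with old model.
        mean = (1.0 - alpha) * mean + alpha * subset_mean
    return mean
```

With `subset_frac=0.2` and 5 passes, the model sees roughly one full-data-pass worth of samples in total, yet the estimate converges toward the full-data statistic, which is the intuition behind the 1/5 and 1/10 data-pass reductions reported above.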