Kernel methods match deep neural networks on TIMIT
Po-Sen Huang, Haim Avron, et al.
ICASSP 2014
Shrinkage-based exponential language models, such as the recently introduced Model M, have provided significant gains across a range of tasks [1]. Training such models requires substantial computational resources in terms of both time and memory. In this paper, we present a distributed training algorithm for these models based on the idea of cluster expansion [2]. Cluster expansion lets us efficiently calculate the normalization and expectation terms required for Model M training by minimizing the computation repeated between consecutive n-grams. We also show how the algorithm can be implemented in a distributed environment, greatly reducing the per-process memory footprint and the overall training time.
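To make concrete why the normalization terms dominate the training cost, the sketch below shows a toy exponential (maxent) n-gram model in Python. Computing Z(h) naively sums over the whole vocabulary for every history; because only a sparse set of words carries history-dependent features, most of that sum can be shared across histories. This is only an illustrative sketch of that general idea, not the paper's cluster-expansion algorithm; the vocabulary, feature weights, and function names here are invented for the example.

```python
import math
from collections import defaultdict

# Hypothetical toy weights for an exponential (maxent) n-gram LM.
# Feature types: unigram (w), bigram (h2, w), trigram ((h1, h2), w).
VOCAB = ["the", "cat", "sat", "on", "mat", "</s>"]
lam_uni = {"the": 0.5, "cat": 0.2, "sat": 0.1, "on": 0.1, "mat": 0.1, "</s>": 0.0}
lam_bi = {("the", "cat"): 0.7, ("the", "mat"): 0.4, ("cat", "sat"): 0.9}
lam_tri = {(("the", "cat"), "sat"): 0.6}

def score(h, w):
    """Log-linear score: sum of active feature weights for word w after history h = (h1, h2)."""
    h1, h2 = h
    return (lam_uni.get(w, 0.0)
            + lam_bi.get((h2, w), 0.0)
            + lam_tri.get(((h1, h2), w), 0.0))

def normalizer_naive(h):
    """Z(h) by summing over the whole vocabulary -- the expensive baseline."""
    return sum(math.exp(score(h, w)) for w in VOCAB)

# Precompute the history-independent (unigram-only) part once.
base = {w: math.exp(lam_uni.get(w, 0.0)) for w in VOCAB}
Z_base = sum(base.values())

# Words whose score actually changes for a given history: those with an
# active bigram or trigram feature. Everything else contributes base[w].
active = defaultdict(set)
for (h2, w) in lam_bi:
    active[h2].add(w)

def normalizer_incremental(h):
    """Z(h) as the shared base sum plus sparse corrections for the few active words."""
    h1, h2 = h
    corrected = set(active.get(h2, set()))
    corrected |= {w for (hist, w) in lam_tri if hist == (h1, h2)}
    return Z_base + sum(math.exp(score(h, w)) - base[w] for w in corrected)

if __name__ == "__main__":
    h = ("the", "cat")
    print(normalizer_naive(h), normalizer_incremental(h))  # the two values agree
```

In the paper's setting, this kind of reuse has to be organized across consecutive n-gram histories and across machines, which is what the cluster-expansion formulation and the distributed implementation address.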
Bhuvana Ramabhadran, Jing Huang, et al.
INTERSPEECH - Eurospeech 2003
Asaf Rendel, Raul Fernandez, et al.
ICASSP 2016
Tara N. Sainath, Avishy Carmi, et al.
ICASSP 2010