Publication
ICSLP 2002
Conference paper
Large vocabulary conversational speech recognition with the Extended Maximum Likelihood Linear Transformation (EMLLT) model
Abstract
This paper applies the recently proposed Extended Maximum Likelihood Linear Transformation (EMLLT) model in a Speaker Adaptive Training (SAT) context on the Switchboard database. Adaptation is carried out by maximum likelihood estimation of linear transforms for the means, the precisions (inverse covariances), and the feature space under the EMLLT model. The paper presents the first experimental evidence that the EMLLT model can achieve significant word-error-rate improvements, in both VTL and VTL+SAT training contexts, over a state-of-the-art diagonal covariance model on a difficult large-vocabulary conversational speech recognition task. The improvements were on the order of 1% absolute in multiple scenarios.
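
For readers unfamiliar with the model, the following is a minimal sketch of the precision structure that EMLLT is generally described as using. It assumes each Gaussian's precision matrix is a weighted sum of rank-one matrices, P = sum_k lambda_k a_k a_k^T, where the directions a_k are shared across Gaussians and the weights lambda_k are specific to each Gaussian. The function names and toy values are hypothetical illustrations and are not taken from the paper; the adaptation transforms discussed in the abstract are not shown.

```python
def emllt_precision(directions, weights):
    """Build P = sum_k weights[k] * a_k a_k^T from shared directions a_k.

    Assumed parameterization: `directions` holds D vectors of dimension d
    (shared across Gaussians), `weights` holds D per-Gaussian scalars.
    """
    d = len(directions[0])
    P = [[0.0] * d for _ in range(d)]
    for a, lam in zip(directions, weights):
        for i in range(d):
            for j in range(d):
                P[i][j] += lam * a[i] * a[j]
    return P


def quadratic_form(directions, weights, x, mean):
    """Compute (x - mean)^T P (x - mean) without forming P explicitly.

    Uses the identity sum_k weights[k] * (a_k . (x - mean))^2, which
    follows directly from the rank-one expansion of P.
    """
    diff = [xi - mi for xi, mi in zip(x, mean)]
    total = 0.0
    for a, lam in zip(directions, weights):
        proj = sum(ai * di for ai, di in zip(a, diff))
        total += lam * proj * proj
    return total


# Hypothetical toy values: d = 2 features, D = 3 shared directions.
directions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [2.0, 1.0, 0.5]

P = emllt_precision(directions, weights)
q = quadratic_form(directions, weights, x=[1.0, 2.0], mean=[0.0, 0.0])
```

With D equal to the feature dimension d, this form corresponds to a diagonal covariance in a linearly transformed feature space; allowing D larger than d is what gives the model more freedom than the diagonal covariance baseline mentioned in the abstract.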