Large vocabulary conversational speech recognition with the Extended Maximum Likelihood Linear Transformation (EMLLT) model

Jing Huang; Vaibhava Goel; Ramesh Gopinath; Brian Kingsbury; Peder Olsen; Karthik Visweswariah

ICSLP 2002

Conference paper

16 Sep 2002

Abstract

This paper applies the recently proposed Extended Maximum Likelihood Linear Transformation (EMLLT) model in a Speaker Adaptive Training (SAT) context on the Switchboard database. Adaptation is carried out by maximum likelihood estimation of linear transforms for the means, the precisions (inverse covariances), and the feature space under the EMLLT model. The paper presents the first experimental evidence that the EMLLT model can achieve significant word-error-rate improvements (in both VTL and VTL+SAT training contexts) over a state-of-the-art diagonal covariance model on a difficult large-vocabulary conversational speech recognition task. The improvements were on the order of 1% absolute in multiple scenarios.
