Publication
ICASSP 2006
Conference paper

Maximum conditional mutual information weighted scoring for speech recognition

Abstract

This paper describes a novel approach for extending the prototype Gaussian mixture model used to represent the different classes in many recognition or classification systems, and its application to large vocabulary automatic speech recognition (ASR). This is achieved by estimating weighting vectors applied to the log likelihood values of the individual elements of the feature vector. The approach estimates the weighting vectors that maximize an estimate of the conditional mutual information between the log likelihood score and a binary random variable indicating whether the log likelihood is computed using the model of the correct label. It is shown in the paper that, under some assumptions on the conditional probability density function (PDF) of the log likelihood score given this random variable, maximizing the differential entropy of a normalized log likelihood score is an equivalent criterion. This approach allows different features in the acoustic feature vector to be emphasized for different hidden Markov model (HMM) states. In this paper, we apply the approach to the RT04 Arabic broadcast news speech recognition task. Compared to the baseline system, a 3% relative improvement in the word error rate (WER) is obtained. © 2006 IEEE.
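The core scoring idea described above can be sketched as follows: for a diagonal-covariance Gaussian, the log likelihood decomposes into per-dimension terms, and a weighting vector scales each dimension's contribution before summation. This is a minimal illustrative sketch, not the paper's implementation; the function names and the single-Gaussian simplification (rather than a full mixture per HMM state) are assumptions.

```python
import numpy as np

def per_dim_loglik(x, mu, var):
    """Per-dimension log likelihood terms of x under a
    diagonal-covariance Gaussian with mean mu and variance var."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def weighted_score(x, mu, var, w):
    """Weighted log likelihood score: each feature dimension's
    log likelihood contribution is scaled by its weight w[d]."""
    return float(np.dot(w, per_dim_loglik(x, mu, var)))

# With uniform unit weights the score reduces to the ordinary
# log likelihood; non-uniform weights emphasize chosen dimensions.
x = np.array([0.5, -1.0, 2.0])
mu = np.zeros(3)
var = np.ones(3)
uniform = weighted_score(x, mu, var, np.ones(3))
emphasized = weighted_score(x, mu, var, np.array([2.0, 1.0, 0.5]))
print(uniform, emphasized)
```

In the paper these weights would be estimated per HMM state to maximize the conditional mutual information criterion; here they are simply supplied by hand.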
