Enrollment in low-resource speech recognition systems

Sabine Deligne; Satya Dharanipragada

ICASSP 2004

Conference paper

28 Sep 2004

Abstract

In this paper we consider the problem of enrollment for low-resource speech recognition systems designed for noisy environments. Noise-robustness concerns, memory and computational constraints, and the use of compact acoustic models for fast Gaussian computation make adaptation especially challenging. We derive a Maximum A Posteriori (MAP) algorithm designed specifically for the fast off-line adaptation of these compact acoustic models. It requires less computation and memory than standard Feature-space Maximum Likelihood Linear Regression (FMLLR), another technique well suited to compact acoustic models. For our experiments on speaker enrollment for in-car speech recognition, we present a computationally efficient procedure to simulate noisy conditions with the adaptation data. In these experiments, MAP compares favorably with FMLLR in terms of recognition accuracy. Moreover, combining FMLLR and MAP significantly outperforms each technique individually, thus providing an efficient alternative for systems with larger resources.
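
For readers unfamiliar with MAP adaptation, the sketch below shows the generic textbook MAP update of a single Gaussian mean: an interpolation between the prior (speaker-independent) mean and the enrollment data, weighted by how much data the Gaussian has seen. It is not the algorithm derived in the paper, whose details the abstract does not give. The function name `map_adapt_mean`, the prior weight `tau`, and the toy inputs are all assumptions made for illustration.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative only).

    prior_mean : list of floats, the speaker-independent mean
    frames     : list of feature vectors from the enrollment data
    posteriors : per-frame occupation probabilities for this Gaussian
    tau        : hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # soft count of frames assigned to this Gaussian

    # Posterior-weighted sum of the enrollment frames.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With no enrollment frames the function returns the prior mean unchanged, and as the occupancy grows the result moves toward the data average, which is the behavior that makes MAP attractive when adaptation data is scarce.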
