Publication
ICASSP 2004
Conference paper
Enrollment in low-resource speech recognition systems
Abstract
In this paper we consider the problem of enrollment for low-resource speech recognition systems designed for noisy environments. Noise-robustness concerns, memory and computational constraints, and the use of compact acoustic models for fast Gaussian computation make adaptation especially challenging. We derive a Maximum A Posteriori (MAP) algorithm designed specifically for fast off-line adaptation of these compact acoustic models. It requires less computation and memory than standard Feature-space Maximum Likelihood Linear Regression (FMLLR), another technique well suited to compact acoustic models. In our experiments on speaker enrollment for in-car speech recognition, we present a computationally efficient procedure to simulate noisy conditions with the adaptation data. In these experiments, MAP compares favorably with FMLLR in terms of recognition accuracy. Moreover, combining FMLLR and MAP significantly outperforms either technique alone, providing an efficient alternative for systems with larger resources.
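For readers unfamiliar with MAP adaptation, the general idea can be sketched as follows. This is the common textbook relevance-factor update of a single Gaussian mean, not the algorithm derived in the paper, whose details the abstract does not give. The function name `map_adapt_mean`, the prior weight `tau`, and the input layout are assumptions made for illustration only.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative sketch only).

    prior_mean -- speaker-independent mean, a list of floats
    frames     -- enrollment feature vectors, each a list of floats
    posteriors -- occupancy probability of this Gaussian for each frame
    tau        -- hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # soft count of frames assigned to this Gaussian

    # Posterior-weighted sum of the enrollment frames.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the data, weighted by tau and occupancy.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With no enrollment data the function returns the prior mean unchanged, and as the occupancy grows the estimate moves toward the weighted sample mean. That behavior is what makes this style of update suitable when only a small amount of adaptation data is available.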