About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ICASSP 2001
Conference paper
Automatic generation and selection of multiple pronunciations for dynamic vocabularies
Abstract
In this paper, we present a new scheme for the acoustic modeling of speech recognition applications requiring dynamic vocabularies. It applies especially to the acoustic modeling of out-of-vocabulary words which need to be added to a recognition lexicon based on the observation of a few (say one or two) speech utterances of these words. Standard approaches to this problem derive a single pronunciation from each speech utterance by combining acoustic and phone transition scores. In our scheme, multiple pronunciations are generated from each speech utterance of a word to enroll by varying the relative weights assigned to the acoustic and phone transition models. In our experiments, the use of these multiple baseforms dramatically outperforms the standard approach with a relative decrease of the word error rate ranging from 20% to 40% on all our test sets.