Publication
ICASSP 1992
Conference paper
Adaptation of large vocabulary recognition system parameters
Abstract
This paper reports on a series of experiments in which the Hidden Markov Model baseforms and the language model probabilities were updated from spontaneously dictated speech captured during recognition sessions with the IBM Tangora system. The basic technique for baseform modification consisted of constructing new fenonic baseforms for all recognized words. To modify the language model probabilities, a simplified version of a cache language model was implemented. The word error rate across six talkers was 3.7%. Baseform adaptation reduced the average error rate to 3.5%, and employing the cache language model reduced the error rate to 3.2%. Combining both techniques further reduced the error rate to 3.1%, a respectable improvement over the original error rate, especially given that the system was speaker-trained prior to adaptation.
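The cache language model mentioned above is typically realized by interpolating a static model with word frequencies from a sliding window of recently recognized text. The sketch below illustrates that general idea with a unigram cache; the class name, parameters (`cache_size`, `lam`), and the interpolation weight are illustrative assumptions, not the paper's actual configuration, which the abstract does not specify.

```python
from collections import Counter, deque

class CacheLanguageModel:
    """Illustrative cache LM: interpolates a static word probability with
    the empirical frequency of words in a recent-history cache.
    This is a hypothetical sketch, not the model from the paper."""

    def __init__(self, static_probs, cache_size=200, lam=0.9):
        self.static = static_probs              # word -> static probability
        self.cache = deque(maxlen=cache_size)   # sliding window of recent words
        self.counts = Counter()                 # counts over the cache window
        self.lam = lam                          # interpolation weight (assumed)

    def observe(self, word):
        # Keep counts consistent with the window: the deque evicts its
        # oldest element automatically once full, so decrement it first.
        if len(self.cache) == self.cache.maxlen:
            self.counts[self.cache[0]] -= 1
        self.cache.append(word)
        self.counts[word] += 1

    def prob(self, word):
        # Linear interpolation of static and cache estimates.
        cache_p = self.counts[word] / len(self.cache) if self.cache else 0.0
        return self.lam * self.static.get(word, 0.0) + (1 - self.lam) * cache_p
```

After observing a user's recent dictation, words that have appeared recently receive a boosted probability relative to the static model alone, which is the mechanism behind the error-rate reduction reported in the abstract.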