About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ICASSP 2010
Conference paper
A comparative study on system combination schemes for LVCSR
Abstract
We present a comparative study on combination schemes for large vocabulary continuous speech recognition by incorporating long-span class posterior probability features into conventional short-time cepstral features. System combination can improve the overall speech recognition performance when multiple systems exhibit different error patterns and multiple knowledge sources encode complementary information. A variety of combination approaches are investigated in this paper, e.g., feature concatenation single stream system, model combination multi-stream system, lattice rescoring and ROVER. These techniques work at different levels of a LVCSR system and have different computational cost. We compared their performance and analyzed their advantages and disadvantages on large vocabulary English broadcast news transcription tasks. Experimental results showed that model combination with independent tree consistently outperforms ROVER, feature concatenation and lattice rescoring. In addition, the phoneme posterior probability features do provide complementary information to short-time cepstral features. ©2010 IEEE.