A comparative study on system combination schemes for LVCSR

Chengyuan Ma; Hong-Kwang Jeff Kuo; Hagen Soltau; Xiaodong Cui; Upendra Chaudhari; Lidia Mangu; Chin-Hui Lee

doi:10.1109/ICASSP.2010.5495627

ICASSP 2010

Conference paper

14 Mar 2010

Abstract

We present a comparative study of combination schemes for large vocabulary continuous speech recognition (LVCSR) that incorporate long-span class posterior probability features into conventional short-time cepstral features. System combination can improve overall speech recognition performance when multiple systems exhibit different error patterns and multiple knowledge sources encode complementary information. We investigate a variety of combination approaches, including a feature-concatenation single-stream system, a model-combination multi-stream system, lattice rescoring, and ROVER. These techniques operate at different levels of an LVCSR system and have different computational costs. We compare their performance and analyze their advantages and disadvantages on large vocabulary English broadcast news transcription tasks. Experimental results show that model combination with independent trees consistently outperforms ROVER, feature concatenation, and lattice rescoring. In addition, the phoneme posterior probability features provide complementary information to the short-time cepstral features. ©2010 IEEE.
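
To make the difference between the two lowest-level schemes named in the abstract concrete, here is a minimal sketch. It is not the authors' implementation. The function names, the stream weight, and the toy numbers are all hypothetical, and the weighted sum of per-stream log-likelihoods is only one common way to realize a multi-stream score. Feature concatenation merges the cepstral and posterior vectors into a single stream before modeling, whereas multi-stream model combination keeps separate acoustic models and merges their scores.

```python
def concatenate_features(cepstral, posteriors):
    """Feature concatenation: join the per-frame vectors of both
    streams into one longer vector per frame (single-stream system)."""
    if len(cepstral) != len(posteriors):
        raise ValueError("both streams must have the same number of frames")
    return [list(c) + list(p) for c, p in zip(cepstral, posteriors)]


def combine_stream_scores(loglik_cepstral, loglik_posterior, weight=0.5):
    """Multi-stream model combination (assumed form): weighted sum of the
    log-likelihoods produced by two separately trained acoustic models.
    `weight` is a hypothetical stream weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * loglik_cepstral + (1.0 - weight) * loglik_posterior


# Toy data: one frame with 2 cepstral coefficients and 2 class posteriors.
frames = concatenate_features([[1.0, 2.0]], [[0.7, 0.3]])
score = combine_stream_scores(-10.0, -20.0, weight=0.25)
```

Lattice rescoring and ROVER work later in the pipeline, on lattices and on word hypotheses respectively, so they are not covered by this sketch.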
