High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation

Liang Gu; Jian Xue; Xiaodong Cui; Yuqing Gao

INTERSPEECH 2008

Conference paper

01 Dec 2008

Abstract

Highly accurate speech recognition with very low latency is a major challenge, and an important requirement, for modern real-time speech recognition applications such as speech-to-speech translation. We address this problem by proposing an effective and efficient streaming-mode decoding scheme. A novel multi-layered feature streaming method is introduced to minimize truncation errors during streaming by optimizing look-ahead parameters. A set of algorithms is further proposed to speed up both Gaussian computation and graph search. Experiments show a dramatic reduction in decoding latency with the proposed decoding scheme, while recognition accuracy remains similar to that of utterance-based decoding. Copyright © 2008 ISCA.
