Publication
INTERSPEECH 2011
Conference paper
Towards high performance LVCSR in speech-to-speech translation system on smart phones
Abstract
This paper presents our efforts to improve the performance of large vocabulary continuous speech recognition (LVCSR) in a speech-to-speech translation system on smart phones. A variety of techniques towards high LVCSR performance are investigated to achieve high accuracy and low latency under constrained resources: one-pass streaming-mode decoding for minimum latency; full-covariance acoustic modeling based on bootstrap and model restructuring to improve recognition accuracy with limited training data; and quantized discriminative feature-space transforms and quantized Gaussian mixture models to reduce memory usage with negligible degradation in recognition accuracy. Speed optimization methods that increase recognition speed are also discussed. The proposed techniques, evaluated on the DARPA Transtac datasets, are shown to give good overall performance under the CPU and memory constraints of smart phones. Copyright © 2011 ISCA.
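To illustrate the kind of memory saving that parameter quantization can yield, the following is a minimal sketch of scalar 8-bit quantization applied to GMM mean vectors. It is not the paper's actual scheme (the abstract does not specify the codebook design or bit allocation); the shapes, function names, and per-dimension min/max quantizer used here are illustrative assumptions only.

```python
import numpy as np

# Sketch only: scalar 8-bit quantization of GMM mean vectors.
# The paper's actual quantization of feature-space transforms and
# Gaussian parameters may use a different codebook design.

def quantize_params(params, n_bits=8):
    """Map each dimension of a float parameter matrix to n_bits integer codes."""
    lo, hi = params.min(axis=0), params.max(axis=0)
    scale = (hi - lo) / (2 ** n_bits - 1) + 1e-12  # avoid divide-by-zero
    codes = np.round((params - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize_params(codes, lo, scale):
    """Reconstruct approximate float parameters from the integer codes."""
    return codes.astype(np.float32) * scale + lo

# Hypothetical model size: 1000 Gaussians with 39-dimensional means
# (a typical MFCC + delta + delta-delta front end).
rng = np.random.default_rng(0)
means = rng.normal(size=(1000, 39)).astype(np.float32)

codes, lo, scale = quantize_params(means)
approx = dequantize_params(codes, lo, scale)

print("float32 bytes:", means.nbytes)   # 1000 * 39 * 4
print("uint8 bytes:  ", codes.nbytes)   # 1000 * 39 * 1, roughly 4x smaller
print("max abs error:", np.abs(means - approx).max())
```

Storing codes as 8-bit integers cuts the memory footprint of the means by roughly a factor of four relative to 32-bit floats, at the cost of a small reconstruction error, which is the general trade-off the abstract describes as "negligible degradation" in recognition accuracy.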