Publication
ICASSP 2011
Conference paper
Multi-view and multi-objective semi-supervised learning for large vocabulary continuous speech recognition
Abstract
Current hidden Markov model (HMM) acoustic modeling for large vocabulary continuous speech recognition (LVCSR) relies on abundant labeled transcriptions. Because speech labeling is both expensive and time-consuming, while large amounts of unlabeled data are readily available, semi-supervised learning (SSL) from both labeled and unlabeled data, which aims to reduce the development cost of LVCSR, is more important than ever. In this paper, we propose SSL for LVCSR using multiple views learned from different acoustic features and randomized decision trees. In addition, we develop multi-objective learning of HMM-based acoustic models by optimizing a hybrid criterion that combines the discriminative mutual information from labeled data with the entropy from unlabeled data. Experiments conducted on Broadcast News show the benefits of the proposed methods. © 2011 IEEE.
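To make the idea of a hybrid criterion concrete, the following is a minimal toy sketch and not the paper's formulation. It assumes a frame-level classifier that outputs unnormalized class scores, uses the average log posterior of the correct label as a simplified stand-in for the discriminative mutual information term, and subtracts a weighted average posterior entropy over unlabeled frames. The function names and the weight `alpha` are hypothetical; the abstract does not specify how the two terms are balanced or computed.

```python
import math

def softmax(scores):
    """Convert unnormalized scores into a posterior distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def hybrid_objective(labeled, unlabeled, alpha=0.5):
    """Toy hybrid criterion to be maximized.

    labeled:   list of (scores, label_index) pairs
    unlabeled: list of score vectors
    alpha:     assumed trade-off weight (hypothetical, not from the paper)
    """
    # Labeled term: average log posterior of the reference label,
    # a simplified proxy for a mutual-information-style objective.
    mi_term = sum(math.log(softmax(s)[y]) for s, y in labeled) / len(labeled)
    # Unlabeled term: average posterior entropy; lower entropy means
    # the model is more confident on unlabeled data.
    ent_term = sum(entropy(softmax(s)) for s in unlabeled) / len(unlabeled)
    return mi_term - alpha * ent_term
```

Under these assumptions, the objective increases when the model assigns higher posterior to the reference labels and when its posteriors on unlabeled frames become more peaked. A real LVCSR system would compute such terms over word or state sequences rather than independent frames.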