Publication
INTERSPEECH 2012
Conference paper
Enhancing exemplar-based posteriors for speech recognition tasks
Abstract
Posteriors generated from exemplar-based sparse representation (SR) methods are typically learned to minimize the reconstruction error of the feature vectors, not through a discriminative process linked to the word error rate (WER) objective of a speech recognition task. In this paper, we explore modeling exemplar-based posteriors to address this issue. First, we model the posteriors by training a neural network (NN) that takes exemplar-based posteriors as inputs. This produces a new set of posteriors learned to minimize a cross-entropy measure and, indirectly, the frame error rate. Second, we apply a tied mixture smoothing technique to these NN posteriors, making them better suited to a speech recognition task. On the TIMIT task, we show that modeling SR posteriors with an NN improves performance by 1.3% absolute, achieving a phone error rate (PER) of 19.0%. Applying the further smoothing technique to these NN posteriors improves the PER to 18.7%, one of the best results reported in the literature on TIMIT.
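The NN posterior modeling step can be illustrated with a short sketch. The code below is a minimal, hedged example assuming a PyTorch setup: a small feed-forward network takes a frame's SR posterior vector as input and is trained with cross-entropy against that frame's phone-state label. The layer sizes, the number of classes, and the optimizer settings are illustrative assumptions, not the paper's exact configuration.

    # Minimal sketch (PyTorch, assumed): remodeling exemplar-based SR posteriors
    # with a small feed-forward network trained on a cross-entropy criterion.
    # Dimensions, depth, and optimizer settings are illustrative assumptions.
    import torch
    import torch.nn as nn

    NUM_CLASSES = 49  # assumed number of phone classes (TIMIT-like setup)

    # Inputs are the SR posteriors (one probability vector per frame);
    # targets are the reference phone labels for each frame.
    model = nn.Sequential(
        nn.Linear(NUM_CLASSES, 1024),
        nn.Sigmoid(),
        nn.Linear(1024, NUM_CLASSES),  # logits; softmax is folded into the loss
    )
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    def train_step(sr_posteriors, frame_labels):
        """One SGD step: sr_posteriors is (batch, NUM_CLASSES), frame_labels is (batch,)."""
        optimizer.zero_grad()
        logits = model(sr_posteriors)
        loss = loss_fn(logits, frame_labels)  # cross-entropy against frame labels
        loss.backward()
        optimizer.step()
        return loss.item()

In this sketch the trained network's softmax outputs serve as the new, discriminatively refined posteriors; minimizing cross-entropy on frame labels indirectly reduces the frame error rate, as the abstract notes. The subsequent tied mixture smoothing step from the paper is not reproduced here.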