Publication
ICSLP 2000
Conference paper
Filterbank-based feature extraction for speech recognition and its application to voice mail transcription
Abstract
In this paper, we propose a filterbank-based technique to extract more robust and discriminative features for telephony speech recognition. First, we propose an extended Lerner grouping method to approximate the shape of the Mel filters in MFCC while reducing the cross-correlation between filterbank outputs. Then we use Welch processing to reduce the variance of the spectral features while retaining the spectral resolution. Finally, we describe experiments in which we augment the cepstral features with formant-related features, computed using an adaptive filterbank. The new features represent the trajectory of the frequency components within different formant bands. Experimental results show that Welch processing consistently improves the word error rate on a large-vocabulary voice mail transcription task and that the formant-related features provide higher discriminability than the MFCC features.
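The abstract does not spell out how the Welch processing is configured, so the following is only a minimal sketch of the textbook Welch idea: split an analysis frame into overlapping sub-segments, window each one, and average their periodograms. The function name `welch_power_spectrum`, the Hamming window, the 50% overlap, and the segment and FFT lengths are all assumptions made for illustration, not details taken from the paper. Zero-padding each sub-segment to `n_fft` keeps the number of frequency bins fixed, but it does not by itself restore the resolution of a full-length transform; how the authors retain resolution is not stated here.

```python
import numpy as np

def welch_power_spectrum(frame, seg_len=128, overlap=0.5, n_fft=256):
    """Average windowed periodograms of overlapping sub-segments of one frame.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    if len(frame) < seg_len:
        raise ValueError("frame is shorter than seg_len")
    hop = max(1, int(seg_len * (1.0 - overlap)))   # step between sub-segments
    window = np.hamming(seg_len)
    scale = np.sum(window ** 2)                    # normalise for window energy
    starts = range(0, len(frame) - seg_len + 1, hop)
    periodograms = [
        np.abs(np.fft.rfft(frame[s:s + seg_len] * window, n=n_fft)) ** 2 / scale
        for s in starts
    ]
    # Averaging the periodograms lowers the variance of each spectral bin.
    return np.mean(periodograms, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((200, 256))       # synthetic white-noise frames
    # One full-length periodogram per frame versus three averaged sub-segments.
    single = np.array([welch_power_spectrum(f, seg_len=256) for f in frames])
    averaged = np.array([welch_power_spectrum(f, seg_len=128) for f in frames])
    print("mean per-bin variance, single periodogram:", single.var(axis=0).mean())
    print("mean per-bin variance, Welch average:     ", averaged.var(axis=0).mean())
```

On synthetic white noise, the averaged estimate should show a lower per-bin variance across frames than the single periodogram, which is the effect the abstract attributes to Welch processing before the filterbank is applied.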