Publication
ICSLP 1998
Conference paper

TECHNIQUES FOR CAPTURING TEMPORAL VARIATIONS IN SPEECH SIGNALS WITH FIXED-RATE PROCESSING

Abstract

Fixed-rate feature extraction, which is used in most current speech recognizers, is equivalent to sampling the feature trajectories at a uniform rate. Often this sampling rate is well below the Nyquist rate and thus leads to distortions in the sampled feature stream due to aliasing. In this paper we explore various techniques, ranging from simple cepstral and spectral smoothing to filtering and data-driven dimensionality expansion using Linear Discriminant Analysis (LDA), to counter aliasing and the variable-rate nature of information in speech signals. Smoothing in the spectral domain reduces the variance of the short-term spectral estimates, which translates directly into a reduction in the variances of the Gaussians in the acoustic models. With these techniques we obtain modest improvements, both in word error rate and in robustness to noise, on large-vocabulary speech recognition tasks.
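
The aliasing argument in the abstract can be illustrated with a small numerical sketch. It is not the paper's method: the rates (a 1000 Hz "dense" trajectory sampled at a 100 Hz frame rate), the 70 Hz test component, and the helper names `dominant_frequency` and `moving_average` are all assumptions chosen for illustration, and a plain moving average stands in for the smoothing and filtering techniques the paper explores.

```python
import numpy as np

def dominant_frequency(x, rate_hz):
    """Frequency (Hz) of the largest bin in the mean-removed magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(x - np.mean(x)))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / rate_hz)
    return freqs[np.argmax(spectrum)]

def moving_average(x, width):
    """Crude low-pass filter: boxcar average over `width` samples."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

# Hypothetical rates: a finely sampled feature trajectory and a fixed frame rate.
dense_rate_hz = 1000
frame_rate_hz = 100            # Nyquist limit of the frame stream is 50 Hz
step = dense_rate_hz // frame_rate_hz

# A trajectory component at 70 Hz, i.e. above the 50 Hz Nyquist limit.
t = np.arange(2000) / dense_rate_hz
trajectory = np.sin(2 * np.pi * 70 * t)

# Fixed-rate sampling with no smoothing: the 70 Hz component folds down to 30 Hz.
naive = trajectory[::step]
alias_hz = dominant_frequency(naive, frame_rate_hz)

# Smoothing before sampling attenuates the component that would otherwise alias.
smoothed = moving_average(trajectory, step)[::step]

print("apparent frequency after naive sampling (Hz):", alias_hz)
print("std of naive frames:   ", np.std(naive))
print("std of smoothed frames:", np.std(smoothed))
```

In this toy setting the unsmoothed frame stream shows a spurious 30 Hz component, while averaging over one frame period before sampling markedly reduces its amplitude. A boxcar filter is a weak anti-aliasing filter, so the alias is attenuated rather than removed.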
