Publication
ICSLP 2004
Conference paper
Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications
Abstract
The Optimized Temporal Decomposition (OTD) technique for representing the speech spectral envelope with Line Spectral Frequencies (LSF) under an MMSE criterion has been shown to be promising for very low bit rate speech coding in storage and broadcast applications. To improve perceptual speech quality, this work introduces a dynamically weighted OTD (DW-OTD) technique, which extends OTD by allowing the weights to change over time. Using Gardner's weighted MSE with DW-OTD is found to reduce the Log Spectral Distance (LSD) measure by 0.3 dB compared to OTD. The delay and complexity requirements of the original OTD algorithm make it unsuitable for real-time speech coding. This paper therefore also introduces a modification of the technique that is suboptimal but suitable for on-line speech coding, with negligible performance degradation (only about 0.06 dB in LSD). With the proposed techniques we were able to encode speech spectral envelopes at 300-370 bps with an LSD of 2.25-2.1 dB, respectively, and a delay of just 7 frames.
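The two error measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input layout (lists of frames), and the weight values are assumptions made for illustration, and Gardner's specific weighting is not reproduced here. The sketch only shows a squared error whose weights may differ from frame to frame, and an LSD computed as the RMS difference between two log-magnitude envelopes given in dB on a common frequency grid.

```python
import math

def weighted_mse(original, reconstructed, weights):
    """Weighted squared error between two LSF tracks, averaged over frames.

    Each argument is a list of frames, and each frame is a list of floats.
    The weights carry a frame index, so they may change over time
    (hypothetical values; the paper's actual weighting is not shown here).
    """
    total = 0.0
    for x, x_hat, w in zip(original, reconstructed, weights):
        total += sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, x_hat, w))
    return total / len(original)

def log_spectral_distance(env_db_a, env_db_b):
    """RMS difference (in dB) between two log-magnitude envelopes.

    Both envelopes are assumed to be sampled on the same frequency grid
    and already expressed in dB.
    """
    squared = [(a - b) ** 2 for a, b in zip(env_db_a, env_db_b)]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch, a larger weight on an LSF component in a given frame makes reconstruction errors in that component count more heavily there, which is the general mechanism by which a time-varying weighting can steer the decomposition toward perceptually important regions.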