Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications

Slava Shechtman; David Malah

ICSLP 2004

Conference paper

04 Oct 2004

Abstract

The Optimized Temporal Decomposition (OTD) technique for Line Spectral Frequencies (LSF) representation of the speech envelope, under an MMSE criterion, has been shown to be promising for very low bit rate speech coding in storage and broadcast applications. To improve perceptual speech quality, this work introduces a dynamically weighted OTD (DW-OTD) technique, which extends OTD by allowing the weights to change over time. Using Gardner's weighted MSE with DW-OTD is found to reduce the Log Spectral Distance (LSD) measure by 0.3 dB compared to OTD. The delay and complexity requirements of the original OTD algorithm make it unsuitable for real-time speech coding. We therefore also introduce a modification of the technique that is suboptimal but suitable for on-line speech coding, with negligible performance degradation (only about 0.06 dB in LSD). With the proposed techniques we were able to encode speech spectral envelopes at 300-370 bps with an LSD of 2.25-2.1 dB, respectively, and a delay of just 7 frames.
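
To illustrate what "temporally changing weights" can mean in a temporal decomposition setting, here is a minimal Python sketch. It is not the DW-OTD algorithm of the paper. It assumes a simplified model in which each LSF frame is interpolated between two neighbouring target vectors by a single event-function value phi in [0, 1], and in which the error weights are diagonal (one weight per LSF coefficient, supplied separately for each frame). The function names and the toy numbers are hypothetical and chosen only for illustration.

```python
def weighted_error(y, y_hat, w):
    """Weighted squared error between an LSF frame and its approximation."""
    return sum(wi * (yi - hi) ** 2 for yi, hi, wi in zip(y, y_hat, w))


def interpolate(a, b, phi):
    """Approximate a frame as phi * a + (1 - phi) * b (assumed two-target model)."""
    return [phi * ai + (1.0 - phi) * bi for ai, bi in zip(a, b)]


def best_event_value(y, a, b, w):
    """Value of phi in [0, 1] that minimises the weighted error for one frame.

    The error is a convex quadratic in phi, so the constrained minimiser is
    the unconstrained one clipped to [0, 1].
    """
    num = sum(wi * (yi - bi) * (ai - bi) for yi, ai, bi, wi in zip(y, a, b, w))
    den = sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))
    if den == 0.0:
        return 0.5  # targets coincide under these weights; every phi is equivalent
    return min(1.0, max(0.0, num / den))


# Toy example (hypothetical values): the same frame and targets, but two
# different per-frame weight vectors, give two different event-function values.
target_a = [0.0, 0.0]
target_b = [1.0, 1.0]
frame = [0.2, 0.8]

phi_uniform = best_event_value(frame, target_a, target_b, [1.0, 1.0])
phi_weighted = best_event_value(frame, target_a, target_b, [3.0, 1.0])
print(phi_uniform, phi_weighted)
```

With uniform weights the fit balances both coefficients equally. Raising the weight on the first coefficient pulls the interpolation toward the target that matches that coefficient better, which is the basic effect a perceptually motivated, frame-dependent weighting is meant to achieve.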
