About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ISCCSP 2010
Conference paper
Footprint reduction of concatenative text-to-speech synthesizers using polynomial temporal decomposition
Abstract
High quality low footprint Concatenative Text-To-Speech (CTTS) synthesizers provide a persistent challenge in the field of speech processing. The spectral parameters representing the short speech segments used in the concatenation process constitute a large portion of the required memory. In this paper we propose to use a vectorial form of Polynomial Temporal Decomposition combined with jointly optimal segmentation and polynomial order selection in order to reduce the storage required for the spectral amplitude parameters by 50%, while preserving the perceptual quality of the obtained synthesized speech. ©2010 IEEE.