Publication
INTERSPEECH - Eurospeech 2001
Conference paper
Generating F0 contours by statistical manipulation of natural F0 shapes
Abstract
This paper proposes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of F0 units are kept essentially unchanged: the analysis phase involves no averaging, and the synthesis phase keeps modification to a minimum. Using these unchanged F0 shapes has great potential for incorporating a wide variety of speaking styles into a single framework, including not only read speech but also dialogue and emotive speech. A linear-regression statistical model is proposed to "manipulate" the stored raw F0 shapes and assemble them into a sentential F0 contour. Experimental evaluations show that the proposed model provides robust F0 contour prediction. With the model, linguistically derived information about a sentence can be mapped directly, in a purely data-driven manner, to the acoustic F0 values of the sentential intonation contour for a trained speaker.
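The general idea of placing stored F0 shapes with a regression model can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not say which quantities are regressed or which linguistic features are used. The sketch assumes a single hypothetical feature (a unit's position in the sentence) that predicts the unit's mean log-F0, and it shifts each stored shape in the log domain so that the shape itself is preserved. The names `fit_line`, `shift_shape`, and `build_contour` are illustrative only.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def shift_shape(shape_hz, target_mean_log_f0):
    """Move a stored F0 shape to a target mean log-F0.

    Assumption: a constant offset in the log domain, so the ratios
    between the samples of the stored shape are left unchanged.
    """
    logs = [math.log(f) for f in shape_hz]
    delta = target_mean_log_f0 - sum(logs) / len(logs)
    return [math.exp(v + delta) for v in logs]


def build_contour(shapes, positions, slope, intercept):
    """Concatenate stored shapes after placing each at its predicted level."""
    contour = []
    for shape, position in zip(shapes, positions):
        predicted_level = slope * position + intercept
        contour.extend(shift_shape(shape, predicted_level))
    return contour


# Hypothetical training data: unit position vs. observed mean log-F0.
train_positions = [0, 1, 2, 3]
train_levels = [math.log(f) for f in (220.0, 200.0, 185.0, 170.0)]
slope, intercept = fit_line(train_positions, train_levels)

# Hypothetical stored shapes (Hz), reused as-is apart from the level shift.
stored_shapes = [[180.0, 210.0, 190.0], [200.0, 170.0]]
contour = build_contour(stored_shapes, [0, 1], slope, intercept)
```

A multi-feature version would replace `fit_line` with a multiple regression over whatever linguistic features are available. The property this sketch aims to show is that the model sets only the placement of each unit, while the stored shape is carried through unaveraged.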