Publication
IEEE Workshop on Speech Synthesis 2002
Conference paper

Statistic prosody structure prediction

Abstract

Hierarchical prosody structure generation is a key component of a speech synthesis system. This paper presents a statistical method that predicts the prosody structure for a Chinese text-to-speech (TTS) system by combining dynamic programming with rules. The method is based on a manually annotated corpus extracted from natural speech (IBM Mandarin TTS Corpus for Female 02). Experimental results show that the method predicts prosodic structure with an accuracy of 91.2%. A state-of-the-art Mandarin TTS system was built on the hierarchical prosody structure. Listening tests show that the predicted prosody structure works well.
