Publication
INTERSPEECH 2007
Conference paper
Combining length distribution model with decision tree in prosodic phrase prediction
Abstract
In Text-to-Speech (TTS) systems, prosodic phrase prediction is important for the naturalness and intelligibility of the synthesized voice. Statistical methods such as dynamic programming (DP), decision trees (DT), and maximum entropy (ME) have been applied to the task, typically with features based on syntactic and lexical information. However, the predicted prosodic phrases often have unrealistic lengths because phrase length distribution is not modeled. This paper proposes a novel algorithm that incorporates a length distribution model into prosodic phrase prediction. Rather than using phrase length directly as a DT or ME feature, the algorithm exploits the correlation between phrase length and the boundary probability given by a decision tree. Experiments show that the proposed algorithm yields relative improvements of 16.37% in recall and 14.05% in precision.
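The abstract does not spell out how the length model and the decision tree are combined, so the following is only an illustrative sketch of one generic way to do it, not the authors' algorithm. It assumes a per-word break probability (standing in for a decision tree's output) and a discrete phrase length distribution, and it uses dynamic programming to find the segmentation that maximizes the sum of their log-probabilities. The function name `segment`, the `weight` parameter, and all numbers are hypothetical.

```python
import math


def segment(boundary_probs, length_probs, weight=1.0):
    """Pick phrase breaks by combining break and length probabilities.

    boundary_probs[i]: assumed probability of a phrase break after word i
                       (a stand-in for a decision tree's output).
    length_probs:      dict mapping phrase length in words to a probability
                       (a stand-in for a length distribution model).
    weight:            hypothetical scaling factor for the length term.
    Returns the indices of the words that end each phrase.
    """
    n = len(boundary_probs)
    eps = 1e-12  # floor to avoid log(0)

    # Prefix sums of log P(no break after word i).
    log_no_break = [0.0] * (n + 1)
    for i, p in enumerate(boundary_probs):
        log_no_break[i + 1] = log_no_break[i] + math.log(max(1.0 - p, eps))

    best = [-math.inf] * (n + 1)  # best[j]: best score for the first j words
    back = [0] * (n + 1)          # back[j]: start of the last phrase
    best[0] = 0.0

    for j in range(1, n + 1):
        # The last word always ends a phrase, so that break costs nothing.
        if j == n:
            log_break = 0.0
        else:
            log_break = math.log(max(boundary_probs[j - 1], eps))
        for i in range(j):
            # Candidate phrase covers words i .. j-1.
            log_len = math.log(max(length_probs.get(j - i, 0.0), eps))
            inner = log_no_break[j - 1] - log_no_break[i]
            score = best[i] + inner + log_break + weight * log_len
            if score > best[j]:
                best[j] = score
                back[j] = i

    # Trace back the phrase-final word indices.
    breaks = []
    j = n
    while j > 0:
        breaks.append(j - 1)
        j = back[j]
    return breaks[::-1]


# Made-up inputs for illustration only.
probs = [0.1, 0.1, 0.6, 0.1, 0.1, 0.1]
lengths = {1: 0.05, 2: 0.2, 3: 0.5, 4: 0.2, 5: 0.04, 6: 0.01}
print(segment(probs, lengths))
```

With these made-up inputs the sketch splits the six words into two three-word phrases, because the length model penalizes a single six-word phrase more than the break probabilities favor it. Setting `weight` to 0 removes the length term and leaves only the break probabilities.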