Publication
Computer Speech and Language
Paper
Topics in decision tree based speech synthesis
Abstract
Most modern speech synthesis systems that use context-dependent decision trees in their acoustic synthesis modules are unit-selection-style concatenative systems, which use the trees essentially as a form of pruning during segment search. The IBM Trainable Speech Synthesis System is one such system. This paper begins by discussing the advantages and disadvantages of the decision-tree and non-decision-tree approaches to unit selection synthesis. It goes on to present the results of formal listening tests conducted on the IBM system to investigate several topics pertinent to decision-tree-based systems. These include the use of extended context features during clustering, the effect of using trees with different numbers of leaves and different numbers of segments per leaf, and the performance of several offline segment preselection algorithms.