Topics in decision tree based speech synthesis

R. Donovan

doi:10.1016/S0885-2308(02)00031-1

Computer Speech and Language

Paper

01 Jan 2003

Abstract

Most modern speech synthesis systems that use context-dependent decision trees in their acoustic synthesis modules are unit selection style concatenative systems, which use the trees essentially as a form of pruning during their segment search. The IBM Trainable Speech Synthesis System is one such system. This paper begins by discussing the advantages and disadvantages of the decision tree and non-decision tree approaches to unit selection synthesis. It goes on to present the results of formal listening tests conducted on the IBM system to investigate a number of topics pertinent to decision tree based systems. These include the use of extended context features during clustering, the effect of using trees with different numbers of leaves and different numbers of segments per leaf, and the performance of several offline segment preselection algorithms.
