Publication
ICASSP 2006
Conference paper
Towards pooled-speaker concatenative text-to-speech
Abstract
In this paper we explore the merging of data from various speakers in building a concatenative text-to-speech system. First, we investigate the pooling of data from multiple speakers for building statistical models to predict pitch and duration, and present listening test results which show that the expressiveness of our TTS system is improved using these techniques. Additionally, we describe an experiment in which we merged databases from several speakers to form an enlarged database from which our concatenative text-to-speech system draws segments. We present listening test results which show that pooling data from several speakers yields higher quality synthetic speech in general domains than restricting ourselves to the data from just one speaker in our repertoire. © 2006 IEEE.