Publication
ICSP 1996
Conference paper
Building a speech database for the purpose of speaker specific speech synthesis
Abstract
This paper presents practical and theoretical work carried out at the IBM Research Laboratory during the course of a speech synthesis project. The paper deals with two separate issues. The first is the generation of a compact set of English utterances that attains good phonetic coverage of the language. The second is the construction of a speaker-specific database. This involves recording the speaker's speech, modeling it with a highly efficient speech representation, and segmenting it into phonemes. The phoneme segmentation is performed semi-automatically, using an iterative algorithm. A custom software tool named SPED was developed to simplify and speed up the segmentation process while also improving its accuracy. The objective of the methodology presented here is to generate new 'Voice Fonts' for Text-to-Speech systems.
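The abstract does not say how the compact utterance set is chosen. One common way to approach this kind of coverage problem is greedy set cover, sketched below purely as an illustration and not as the authors' method. The function name `greedy_select`, the sentence IDs, and the phoneme labels are all hypothetical, and the toy data is hand-written rather than produced by a real phonetizer.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], targets: Set[str]) -> List[str]:
    """Greedily pick sentences until every reachable target unit is covered.

    `candidates` maps a sentence ID to the set of phonetic units it contains.
    This is a generic greedy set-cover heuristic (an assumption, not the
    procedure described in the paper).
    """
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered and candidates:
        # Choose the sentence that covers the most still-uncovered units.
        best = max(candidates, key=lambda s: len(candidates[s] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # the remaining units occur in no candidate sentence
        chosen.append(best)
        uncovered -= gain
    return chosen


# Toy data: placeholder sentence IDs mapped to placeholder phoneme labels.
toy = {
    "s1": {"AA", "B", "K"},
    "s2": {"B", "K"},
    "s3": {"D", "IY"},
    "s4": {"AA", "D"},
}
selected = greedy_select(toy, {"AA", "B", "K", "D", "IY"})
print(selected)
```

Greedy selection does not guarantee the smallest possible set, but it is simple and tends to yield a compact one. In practice the units to cover might be diphones or phonemes in context rather than single phonemes, which only changes the contents of the sets.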