Conference paper
Coupling vs. unifying: Modeling techniques for speech-to-speech translation
Abstract
As part of our effort to develop a unified computational framework for speech-to-speech translation, in which sub-optimization or local optimization of individual components can be avoided, we are developing direct models for speech recognition. In a direct model, the focus is on a single integrated model p(text | acoustics) rather than a complex series of intermediate components; as a result, diverse factors such as linguistic and language features, speaker or speaking-rate differences, and varying acoustic conditions can all contribute to a joint optimization. In this paper we discuss how linguistic and semantic constraints are used in phoneme recognition.
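The contrast between a coupled pipeline and the unified direct model can be sketched as follows (a simplified illustration, not the paper's exact formulation; X denotes the source acoustics, f a source-language transcript, and e the target-language text):

```latex
% Coupled (cascaded) approach: each stage is optimized locally,
% so errors in the intermediate transcript \hat{f} propagate.
\hat{f} = \arg\max_{f} \; p(f \mid X), \qquad
\hat{e} = \arg\max_{e} \; p(e \mid \hat{f})

% Unified (direct) approach: one integrated model, with the
% intermediate representation marginalized out in a joint optimization.
\hat{e} = \arg\max_{e} \; p(e \mid X)
        = \arg\max_{e} \sum_{f} p(e, f \mid X)
```

In the cascaded form, the hard decision at the first arg max discards alternatives that the second stage might have preferred; the direct form avoids this local optimization by scoring target hypotheses against the acoustics through all intermediate paths.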