Publication
ICASSP 2016
Conference paper
Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end
Abstract
The prediction of symbolic prosodic categories from text is an important, but challenging, natural-language processing task given the various ways in which an input can be realized, and the fact that knowledge about what features determine this realization is incomplete or inaccessible to the model. In this work, we look at augmenting baseline features with lexical representations that are derived from text, providing continuous embeddings of the lexicon in a lower-dimensional space. Although learned in an unsupervised fashion, such features capture semantic and syntactic properties that make them amenable for prosody prediction. We deploy various embedding models on prominence- and phrase-break prediction tasks, showing substantial gains, particularly for prominence prediction.
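The feature-augmentation idea described above can be sketched as follows. This is a minimal toy illustration, not the paper's actual system: the embeddings are random stand-ins for unsupervised lexical embeddings, the baseline features are hypothetical, and the classifier is a simple logistic-regression model trained by gradient descent rather than whatever model the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for continuous lexical embeddings learned from text
# (e.g. word2vec-style); random vectors here purely for illustration.
vocab = ["the", "cat", "sat", "on", "mat"]
emb_dim = 8
embeddings = {w: rng.normal(size=emb_dim) for w in vocab}

# Hypothetical baseline features per token (e.g. position in utterance);
# a real front-end would use POS tags, punctuation context, etc.
def baseline_features(token_index, n_tokens):
    return np.array([token_index / n_tokens, 1.0])

# Augment baseline features by concatenating the token's embedding.
def featurize(tokens):
    n = len(tokens)
    return np.stack([
        np.concatenate([baseline_features(i, n), embeddings[t]])
        for i, t in enumerate(tokens)
    ])

# Tiny logistic-regression prominence classifier (gradient descent).
def train(X, y, lr=0.1, steps=2000):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

tokens = ["the", "cat", "sat", "on", "mat"]
y = np.array([0, 1, 1, 0, 1])  # toy labels: 1 = prominent content word
X = featurize(tokens)          # shape (5, 2 + emb_dim)
w = train(X, y)
preds = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(int)
```

The same concatenation scheme applies unchanged to phrase-break prediction; only the labels (break vs. no break after each token) differ.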