Publication
INTERSPEECH 2010
Conference paper
Discriminative training and unsupervised adaptation for labeling prosodic events with limited training data
Abstract
Many applications of spoken-language systems can benefit from access to annotations of prosodic events. Unfortunately, obtaining human annotations of these events, even in amounts sufficient to train a supervised system, can be laborious and costly. In this paper we explore applying conditional random fields to automatically label major and minor break indices and pitch accents in a corpus of recorded and transcribed speech, using a large set of fully automatically extracted acoustic and linguistic features. We demonstrate the robustness of these features in a discriminative training framework as the amount of training data is reduced. We also explore adapting the baseline system in an unsupervised fashion to a target dataset for which no prosodic labels are available, and show that, when operating at a point where only limited amounts of data are available, an unsupervised approach can offer up to an additional 3% improvement. © 2010 ISCA.
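To illustrate the kind of sequence labeling a conditional random field performs, the following is a minimal sketch of Viterbi decoding for a linear-chain model. It is not the paper's implementation: the two-label set, the function name viterbi, and all scores are hypothetical values chosen for illustration, and a trained system would derive its per-word scores from the acoustic and linguistic features rather than from hand-set numbers.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label set: whether a word carries a pitch accent.
LABELS = ["none", "accent"]


def viterbi(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: Sequence[str] = LABELS,
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][y]      -- log-domain score of label y at position t
    transitions[(a, b)]  -- log-domain score of label a followed by label b
    """
    if not emissions:
        return []

    # best[t][y] holds the best score of any path ending in label y at t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t-1][y] is the best predecessor of label y at t

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[(p, y)])
            scores[y] = best[-1][prev] + transitions[(prev, y)] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-word utterance.
emissions = [
    {"none": 0.2, "accent": 1.0},
    {"none": 0.6, "accent": 0.5},
    {"none": 0.1, "accent": 0.9},
]
# Made-up transition scores that discourage two accents in a row.
transitions = {
    ("none", "none"): 0.3,
    ("none", "accent"): 0.0,
    ("accent", "none"): 0.0,
    ("accent", "accent"): -1.0,
}

print(viterbi(emissions, transitions))
```

Because the decoder scores whole label sequences rather than each word in isolation, the transition penalty on consecutive accents can override a locally attractive label, which is the main reason a sequence model suits this task.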