Publication
SLT 2018
Conference paper
Comparing Prosodic Frameworks: Investigating the Acoustic-Symbolic Relationship in ToBI and RaP
Abstract
ToBI is the dominant tool for symbolically describing prosodic content in American English speech material. This is due to its descriptive power and theoretical grounding, but also to the amount of annotated data available. Recently, a modest amount of material annotated with the Rhythm and Pitch (RaP) framework was released publicly. In this paper, we investigate the acoustic-symbolic relationship under these two systems, with experiments in both directions. From acoustic to symbolic, we compare the automatic prediction of prosodic prominence as defined under each system. From symbolic to acoustic, we examine how well each annotation standard prescribes the acoustics of a given utterance from its symbolic sequence. We find RaP to be promising: for some aspects of these tasks, it shows a somewhat stronger acoustic-symbolic relationship than ToBI given a comparable amount of data. ToBI results are stronger when more annotated data is available, and it remains to be shown whether RaP performance can scale up.