BabyEars: A recognition system for affective vocalizations
Abstract
Our goal was to see how much of the affective message we could recover using simple acoustic measures of the speech signal. Using pitch and broad spectral-shape measures, a multidimensional Gaussian mixture-model discriminator classified adult-directed (neutral affect) versus infant-directed speech correctly more than 80% of the time, and classified the affective message of infant-directed speech correctly nearly 70% of the time. We confirmed previous findings that changes in pitch provide an important cue for affective messages. In addition, we found that timbre, as captured by cepstral coefficients, also provides important information about the affective message. Mothers' speech was significantly easier to classify than fathers' speech, suggesting either clearer distinctions among these messages in mothers' speech to infants, or a difference between fathers and mothers in the acoustic information used to convey these messages. Our research is a step towards machines that sense the "emotional state" of a speaker.
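The abstract describes a classifier built from per-class Gaussian mixture models over simple acoustic features (pitch plus broad spectral shape). The following is a minimal sketch of that kind of pipeline, not the authors' implementation: the feature extractor (librosa's pYIN pitch tracker and MFCCs), frame alignment, mixture sizes, and the decision rule (highest mean frame log-likelihood) are all illustrative assumptions rather than details given in the paper.

```python
# Sketch of a GMM-based affect classifier over pitch + spectral-shape features.
# All parameter choices here are assumptions for illustration only.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def acoustic_features(wav_path, sr=16000):
    """Frame-level pitch and spectral-shape (MFCC) features for one utterance."""
    y, sr = librosa.load(wav_path, sr=sr)
    f0, _, _ = librosa.pyin(y, fmin=75, fmax=600, sr=sr)   # pitch track in Hz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=8)      # broad spectral shape
    n = min(len(f0), mfcc.shape[1])                        # align frame counts
    f0 = np.nan_to_num(f0[:n], nan=0.0)                    # unvoiced frames -> 0
    return np.column_stack([f0, mfcc[:, :n].T])            # shape (frames, 1 + 8)

def train_gmms(labeled_paths, n_components=4):
    """Fit one GMM per affect class from a dict {label: [wav paths]}."""
    gmms = {}
    for label, paths in labeled_paths.items():
        X = np.vstack([acoustic_features(p) for p in paths])
        gmms[label] = GaussianMixture(n_components=n_components,
                                      covariance_type="full").fit(X)
    return gmms

def classify(wav_path, gmms):
    """Choose the class whose GMM gives the utterance the highest mean log-likelihood."""
    X = acoustic_features(wav_path)
    return max(gmms, key=lambda label: gmms[label].score_samples(X).mean())
```

Under these assumptions, training would be called with, e.g., classes such as "adult-directed" and the infant-directed affect categories, and each test utterance would be assigned to whichever class model explains its frames best.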