Sinusoidal model parameterization for HMM-based TTS system

Slava Shechtman; Alex Sorin

INTERSPEECH 2010

Conference paper

26 Sep 2010

Abstract

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS but is less common in statistical TTS frameworks. In this work we use Regularized Cepstral Coefficients (RCC), estimated on the mel-frequency scale, to model the amplitude spectrum envelope within an HMM-based TTS platform. We report improved subjective quality for mel-frequency RCC (MRCC) combined with sinusoidal-model-based reconstruction, compared to the state-of-the-art MGC-LSP parameters. © 2010 ISCA.
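As a rough illustration of the idea in the abstract, the sketch below fits cepstral coefficients to the log amplitudes of a set of sinusoids on a mel-warped frequency axis, using a generic regularized least-squares formulation. The abstract does not give the paper's actual estimator, so everything here is an assumption: the mel formula, the quadratic smoothness penalty, the default `order` and `lam` values, and the function names `mel_warp`, `fit_regularized_cepstrum` and `envelope` are all hypothetical.

```python
import numpy as np


def mel_warp(freq_hz, fs):
    """Map Hz to a warped angle in [0, pi].

    Assumption: the common 2595*log10(1 + f/700) mel formula; the paper's
    warping function is not stated in the abstract.
    """
    f = np.asarray(freq_hz, dtype=float)
    mel = 2595.0 * np.log10(1.0 + f / 700.0)
    mel_nyquist = 2595.0 * np.log10(1.0 + (fs / 2.0) / 700.0)
    return np.pi * mel / mel_nyquist


def _cosine_basis(omega, order):
    """Basis for log|H(w)| = c0 + 2 * sum_{k>=1} c_k * cos(k * w)."""
    k = np.arange(order + 1)
    basis = np.cos(np.outer(omega, k))
    basis[:, 1:] *= 2.0
    return basis


def fit_regularized_cepstrum(freq_hz, amps, fs, order=20, lam=1e-3):
    """Fit cepstral coefficients to sinusoid log amplitudes.

    Solves (B^T B + lam * R) c = B^T log(a), where R = diag(k^2) penalizes
    high-order coefficients (an assumed, generic smoothness penalty).
    """
    omega = mel_warp(freq_hz, fs)
    basis = _cosine_basis(omega, order)
    penalty = np.diag(np.arange(order + 1, dtype=float) ** 2)
    log_amps = np.log(np.asarray(amps, dtype=float))
    return np.linalg.solve(basis.T @ basis + lam * penalty, basis.T @ log_amps)


def envelope(cepstrum, freq_hz, fs):
    """Evaluate the amplitude envelope implied by the coefficients."""
    omega = mel_warp(freq_hz, fs)
    basis = _cosine_basis(omega, len(cepstrum) - 1)
    return np.exp(basis @ cepstrum)
```

In a sinusoidal analysis frame, `freq_hz` and `amps` would be the measured frequencies and amplitudes of the sinusoids (for voiced speech, typically harmonics of the pitch), and `envelope` can then be sampled at whatever frequencies the reconstruction needs.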
