Applying log linear model based context dependent machine translation techniques to Grapheme-to-Phoneme conversion

Rong Zhang; Bowen Zhou

doi:10.1109/ICASSP.2010.5495551

ICASSP 2010

Conference paper

14 Mar 2010

Abstract

Grapheme-to-Phoneme conversion is a challenging task for speech recognition and text-to-speech systems, for which the ability to automatically predict pronunciations of out-of-vocabulary (OOV) words is highly desirable. In this paper, Grapheme-to-Phoneme conversion is viewed as a special case of the sequence translation problem, and we propose to tackle it with a phrase-based log-linear translation model. We improve the standard machine translation method by utilizing context-dependent units, which lead to a better many-to-many alignment between chunks of graphemes and phonemes. Furthermore, a hypothesis combination technique is applied to combine the outputs generated by multiple translation models trained with different alignment units. Our proposed approach was evaluated on the NetTalk and CMUDict datasets. Significant improvements in conversion accuracy are observed on both sets compared to the conventional translation method: phoneme-level error rates are reduced by a relative 18.4% and 22.5%, respectively. Our approach also performs as well as or better than previously published data-driven methods evaluated on the same tasks. ©2010 IEEE.
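The abstract reports results as relative reductions in phoneme-level error rate. As a rough illustration of how such a figure can be computed, the sketch below assumes the common definition of phoneme error rate (total Levenshtein edit distance between reference and hypothesized phoneme sequences, divided by the total number of reference phonemes). The abstract does not spell out the paper's exact scoring procedure, so this definition, the function names, and the toy pronunciations are all assumptions made for illustration, not the authors' evaluation code.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference phonemes
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: List[Sequence[str]],
                       hyps: List[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phonemes (assumed definition)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Toy data (hypothetical pronunciations, not taken from NetTalk or CMUDict).
refs = [["K", "AE", "T"], ["D", "AO", "G"]]
baseline_hyps = [["K", "AA", "T"], ["D", "AA", "G"]]  # two substitutions
improved_hyps = [["K", "AA", "T"], ["D", "AO", "G"]]  # one substitution

baseline_per = phoneme_error_rate(refs, baseline_hyps)
improved_per = phoneme_error_rate(refs, improved_hyps)
print(f"baseline PER: {baseline_per:.3f}")
print(f"improved PER: {improved_per:.3f}")
print(f"relative reduction: {relative_reduction(baseline_per, improved_per):.1%}")
```

A relative reduction is measured against the baseline's own error rate, so an 18.4% relative reduction means the new error rate is 81.6% of the conventional system's rate, not 18.4 percentage points lower.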
