Publication
ICASSP 2010
Conference paper
Applying log linear model based context dependent machine translation techniques to Grapheme-to-Phoneme conversion
Abstract
Grapheme-to-Phoneme conversion is a challenging task for speech recognition and text-to-speech systems, for which the ability to automatically predict pronunciations of out-of-vocabulary (OOV) words is highly desirable. In this paper, Grapheme-to-Phoneme conversion is viewed as a special case of the sequence translation problem, and we propose to tackle it with a phrase-based log-linear translation model. We improve the standard machine translation method by using context-dependent units, which lead to a better many-to-many alignment between chunks of graphemes and phonemes. Furthermore, a hypothesis combination technique is applied to combine the outputs generated by multiple translation models trained with different alignment units. Our proposed approach was evaluated on the NetTalk and CMUDict datasets. Significant improvements in conversion accuracy are observed on both sets compared to the conventional translation method: phoneme-level error rates are reduced by 18.4% and 22.5% relative, respectively. Our approach also performs better than or as well as previously published data-driven methods evaluated on the same tasks. © 2010 IEEE.
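To make the reported metric concrete, the following is a minimal sketch of how a phoneme-level error rate and its relative reduction could be computed. The abstract does not define the metric, so this assumes the common definition (Levenshtein edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes). The function names `edit_distance`, `phoneme_error_rate`, and `relative_reduction` are hypothetical and not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference phonemes to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Assumed definition: total edit errors / total reference phonemes."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction, e.g. 0.20 means a 20% relative reduction."""
    return (baseline - improved) / baseline
```

Under this definition, a system that lowers a hypothetical baseline error rate of 10% to 8% achieves a 20% relative reduction. These numbers are illustrative only and are not results from the paper.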