Publication
ICASSP 2020
Conference paper

Converting Written Language to Spoken Language with Neural Machine Translation for Language Modeling

Abstract

When building a language model (LM) for spontaneous speech, the ideal situation is to have a large amount of spoken, in-domain training data. Having such abundant data, however, is not realistic. We address this problem by using a neural machine translation (NMT) model to generate spoken-language texts from written-language texts. We collected faithful transcripts of fully spontaneous speech and corresponding written versions and used them as a parallel corpus to train the NMT model. We used top-k random sampling, which generates a wider variety of higher-quality texts than other NMT generation methods. We show that the NMT model can convert written texts in a given domain into spoken texts, and that the converted texts are effective for training LMs. Our experimental results show significant improvement of speech recognition accuracy with the LMs.
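The abstract's generation step relies on top-k random sampling: at each decoding step, the k most probable next tokens are kept and one is drawn at random from their renormalized distribution, so repeated decoding of the same written sentence yields varied spoken-style outputs. The following minimal sketch (in PyTorch, not the paper's actual code; the model interface and vocabulary size are assumptions) illustrates a single top-k sampling step.

```python
import torch

def sample_top_k(logits: torch.Tensor, k: int = 10) -> torch.Tensor:
    """Top-k random sampling: keep the k highest-scoring tokens and sample among them.

    `logits` is assumed to be a 1-D tensor of unnormalized next-token scores
    produced by one decoder step of an NMT model (hypothetical interface).
    """
    top_values, top_indices = torch.topk(logits, k)    # k best candidate tokens
    probs = torch.softmax(top_values, dim=-1)          # renormalize over those k
    choice = torch.multinomial(probs, num_samples=1)   # draw one token at random
    return top_indices[choice]                         # map back to a vocabulary id

if __name__ == "__main__":
    torch.manual_seed(0)
    fake_logits = torch.randn(32000)   # stand-in for a 32k-token vocabulary
    # Sampling several times from the same distribution gives different tokens,
    # which is what lets the method produce a large, diverse set of spoken-style texts.
    print([sample_top_k(fake_logits, k=10).item() for _ in range(5)])
```

Running the full decoder with this sampling rule over a written-language corpus, possibly several times per sentence, would produce the kind of varied spoken-style training data the abstract describes.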

Date

01 May 2020