A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique
Abstract
This paper describes a method for adapting a language model created from written-text corpora to spontaneous speech by using a semi-linear interpolation technique. Spoken-language corpora are usually far smaller than written-text corpora in both size and topic coverage. We propose an approach that adapts a base language model to the style of spontaneous speech on the basis of the following assumption. Words that are topic-independent, that is to say, common in spontaneous speech, should be predicted mainly by a model created from spontaneous speech corpora (the style model), while the base model is more reliable for predicting topic-related words, because such words are difficult to predict from a model based on a small corpus. We classified all words into disfluencies and normal words. The normal words were further classified into two categories, common words and topic words, according to mutual information. For each category, the qualified models (base or style) are selected together with the optimal weights for linear interpolation. In other words, a different linear combination of the models is used for each category of the predicted word. We conducted experiments using a Japanese spoken-language corpus to create the style model. We achieved a test-set perplexity of 159.1, compared with 189.3 for the baseline (simple linear interpolation) and 230.7 for the style-specific model.
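
The category-dependent combination described in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the authors' implementation. The function name `semi_linear_interpolate`, the weight table `STYLE_WEIGHT`, and the toy probabilities are all hypothetical. The abstract says the weights are optimized, whereas here they are fixed by hand. The sketch also ignores the word history (a unigram toy case) and renormalizes over the vocabulary, which is an assumption the abstract does not state.

```python
from typing import Dict

# Hypothetical weights on the style model, one per category of the
# predicted word. Real weights would be estimated on held-out data.
STYLE_WEIGHT: Dict[str, float] = {
    "disfluency": 1.0,  # assumed: style model only
    "common": 0.7,      # assumed: mostly style model
    "topic": 0.2,       # assumed: mostly base model
}


def semi_linear_interpolate(
    p_base: Dict[str, float],
    p_style: Dict[str, float],
    category: Dict[str, str],
    style_weight: Dict[str, float] = STYLE_WEIGHT,
) -> Dict[str, float]:
    """Mix two distributions with a weight chosen by the predicted word's category.

    Because the weight varies per word, the mixture does not sum to one
    on its own, so the result is renormalized (an assumption of this sketch).
    """
    mixed = {}
    for word, base_prob in p_base.items():
        lam = style_weight[category[word]]
        mixed[word] = lam * p_style.get(word, 0.0) + (1.0 - lam) * base_prob
    total = sum(mixed.values())
    return {word: prob / total for word, prob in mixed.items()}


# Toy distributions over a three-word vocabulary (invented numbers).
p_base = {"uh": 0.01, "well": 0.19, "parliament": 0.80}
p_style = {"uh": 0.40, "well": 0.50, "parliament": 0.10}
category = {"uh": "disfluency", "well": "common", "parliament": "topic"}

p_mixed = semi_linear_interpolate(p_base, p_style, category)
for word, prob in p_mixed.items():
    print(f"{word}: {prob:.3f}")
```

With a single shared weight for every word this reduces to simple linear interpolation, the baseline that the abstract compares against.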