Subsequence similarity language models

Juan M. Huerta

doi:10.1109/ICASSP.2011.5947624

ICASSP 2011

Conference paper

18 Aug 2011



Abstract

In this work we present the Subsequence Similarity Language Model (S2-LM), a new approach to language modeling based on string similarity. As a language model, S2-LM scores its input according to the closest matching string in a very large corpus. We describe the properties and advantages of our approach, along with efficient methods for computing it. We also describe an n-best rescoring experiment intended to show that S2-LM can be adjusted to behave like an n-gram SLM. © 2011 IEEE.
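To make the idea of scoring by closest matching string concrete, here is a minimal Python sketch. It is not the scoring function from the paper, which the abstract does not specify. It assumes a toy similarity measure: the length of the longest contiguous token run that a hypothesis shares with any corpus sentence, divided by the hypothesis length. The names `tokenize`, `longest_common_run`, `similarity_score`, and `rescore_nbest` are hypothetical, and the brute-force scan over the corpus stands in for the efficient computation methods the abstract mentions.

```python
def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def longest_common_run(a, b):
    """Length of the longest contiguous token run shared by lists a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous token pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def similarity_score(hypothesis, corpus):
    """Toy score in [0, 1]: best shared run length over hypothesis length.

    Assumption: this stands in for the paper's similarity-based score.
    """
    hyp = tokenize(hypothesis)
    if not hyp:
        return 0.0
    best = max((longest_common_run(hyp, tokenize(s)) for s in corpus), default=0)
    return best / len(hyp)


def rescore_nbest(nbest, corpus):
    """Order n-best hypotheses by descending toy similarity score."""
    return sorted(nbest, key=lambda h: similarity_score(h, corpus), reverse=True)


corpus = ["the cat sat on the mat", "a dog barked at the mailman"]
nbest = ["the cat sat in a hat", "the cat sat on the mat"]
ranked = rescore_nbest(nbest, corpus)
```

In this toy setup, a hypothesis that appears verbatim in the corpus gets the maximum score and moves to the top of the n-best list. A real system would need an index over the corpus rather than a linear scan.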
