Publication
ICASSP 2007
Conference paper
Joint morphological-lexical language modeling (JMLLM) for Arabic
Abstract
Language modeling for inflected languages such as Arabic poses new challenges for speech recognition because of their rich morphology, which leads to large increases in perplexity and out-of-vocabulary (OOV) rate. In this study, we present a new language modeling method that takes advantage of Arabic morphology by combining, within a single joint model, morphological segments, the underlying lexical items, and additional available information sources about both. Joint representation and modeling of morphological and lexical items reduces the OOV rate and provides smooth probability estimates. Preliminary experiments detailed in this paper show satisfactory improvements over word- and morpheme-based trigram language models and their interpolations. © 2007 IEEE.
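To make the OOV argument concrete, the following is a minimal sketch, not the JMLLM model itself, of how splitting words into morphological segments can lower the OOV rate. The prefix inventory, the `segment` function, and the transliterated toy word lists are all hypothetical and chosen only for illustration; a real system would use a proper Arabic morphological segmenter.

```python
def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    if not test_tokens:
        return 0.0
    vocab = set(train_tokens)
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


# Hypothetical toy prefix inventory (transliterated); not taken from the paper.
PREFIXES = ("wa", "al", "bi")


def segment(word):
    """Split off at most one known prefix, marking it with a trailing '+'."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix) + 1:
            return [prefix + "+", word[len(prefix):]]
    return [word]


def segment_all(words):
    """Flatten the segmentation of every word into one token list."""
    return [seg for word in words for seg in segment(word)]


# Toy data: every test word is a new prefix/stem combination.
train_words = ["alkitab", "waqalam", "bidars", "bayt"]
test_words = ["albayt", "wakitab", "biqalam", "dars"]

word_oov = oov_rate(train_words, test_words)
segment_oov = oov_rate(segment_all(train_words), segment_all(test_words))

print(f"word-level OOV rate:    {word_oov:.2f}")
print(f"segment-level OOV rate: {segment_oov:.2f}")
```

In this toy setup none of the test words was seen whole in training, yet every prefix and stem was, so the segment-level vocabulary covers the test data while the word-level vocabulary does not. The abstract's joint model goes further by tying segments back to their lexical items, which this sketch does not attempt.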