Publication
SIGIR 2014
Conference paper

Probabilistic text modeling with orthogonalized topics

Abstract

Topic models have been widely used for text analysis and have enjoyed great success in mining the latent topic structure of text documents. However, while many efforts have been made to endow the resulting document-topic distributions with various desirable properties, little attention has been paid to the resulting topic-word distributions. Since the topic-word distributions also play an important role in modeling performance, topic models that emphasize only the document-topic representations while neglecting the topic-term distributions are limited. In this paper, we propose the Orthogonalized Topic Model (OTM), which imposes an orthogonality constraint on the topic-term distributions. We also propose a novel model-fitting algorithm based on the generalized Expectation-Maximization algorithm and the Newton-Raphson method. Quantitative evaluation on text classification demonstrates that OTM outperforms baseline models and indicates the important role played by topic orthogonalization.
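
The abstract does not give the exact form of the orthogonality constraint or the fitting updates, so the Python sketch below is only an illustration of the general idea: it measures how far a set of topic-word distributions is from mutual orthogonality via their pairwise inner products. The names topic_orthogonality_penalty and topic_word are ours, not from the paper, and this is not the OTM algorithm itself.

import numpy as np

def topic_orthogonality_penalty(topic_word):
    """Sum of squared pairwise inner products between topic-word distributions.

    topic_word: array of shape (K, V); each row is one topic's distribution
    over the vocabulary (rows sum to 1). The penalty is zero only when all
    topic rows are mutually orthogonal, i.e. topics share no probability mass
    on the same words.
    """
    gram = topic_word @ topic_word.T          # (K, K) matrix of inner products
    off_diag = gram - np.diag(np.diag(gram))  # keep only cross-topic terms
    return np.sum(off_diag ** 2)

# Toy example: three random topics over a five-word vocabulary.
rng = np.random.default_rng(0)
topics = rng.dirichlet(np.ones(5), size=3)
print(topic_orthogonality_penalty(topics))

A penalty of this kind could, for instance, be added to a topic model's objective so that the fitting procedure trades off data likelihood against overlap between topic-word distributions; the paper's actual formulation and its generalized EM / Newton-Raphson updates are described in the full text.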

Date

06 Jul 2014

