Short Text Clustering in Continuous Time Using Stacked Dirichlet-Hawkes Process with Inverse Cluster Frequency Prior

Avirup Saha; Balaji Ganesan

KDD 2021

Workshop paper

14 Aug 2021

Abstract

Traditional models for short text clustering ignore the time information associated with the text documents. However, prior work has shown that the temporal characteristics of streaming documents are significant features for clustering. In this paper, we propose a stacked Dirichlet-Hawkes process with an inverse cluster frequency prior as a simple but effective solution for short text clustering using temporal features in continuous time. Based on the classical formulation of the Dirichlet-Hawkes process, our model provides an elegant, theoretically grounded, and interpretable solution while performing on par with recent state-of-the-art models in short text clustering.
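
For readers unfamiliar with the classical Dirichlet-Hawkes process that the abstract builds on, the following is a minimal sketch of its temporal cluster-assignment prior only. It is not the stacked model or the inverse cluster frequency prior proposed in the paper, and it omits the text likelihood. The exponential triggering kernel, the function name `dhp_prior`, and the parameters `base_rate` and `decay` are illustrative assumptions, not taken from the paper.

```python
import math


def dhp_prior(t, history, base_rate=1.0, decay=1.0):
    """Prior over cluster assignments for a document arriving at time t.

    history: list of (timestamp, cluster_id) pairs for earlier documents.
    base_rate: assumed constant intensity for opening a new cluster.
    decay: assumed rate of an exponential triggering kernel exp(-decay * dt).
    """
    # Hawkes intensity of each existing cluster: sum of kernel values
    # over that cluster's past documents.
    intensities = {}
    for t_i, k in history:
        if t_i < t:
            intensities[k] = intensities.get(k, 0.0) + math.exp(-decay * (t - t_i))

    # Normalise: existing clusters compete with the base rate for a new cluster.
    total = base_rate + sum(intensities.values())
    probs = {k: v / total for k, v in intensities.items()}
    probs["new"] = base_rate / total
    return probs


# Two earlier documents in cluster 0, one recent document in cluster 1.
example_history = [(0.0, 0), (0.5, 0), (0.9, 1)]
example_probs = dhp_prior(1.0, example_history)
```

Under these assumptions, clusters with more recent activity receive a higher prior probability, and the probability of a new cluster grows as all existing clusters go quiet.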

Conference paper