Online ℓ1-dictionary learning with application to novel document detection

Shiva Prasad Kasiviswanathan; Huahua Wang; Arindam Banerjee; Prem Melville

NeurIPS 2012

Conference paper

01 Dec 2012

Abstract

Given their pervasive use, social media such as Twitter have become a leading source of breaking news. A key task in the automated identification of such news is the scalable detection of novel documents in a voluminous stream of text documents. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating direction method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.
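
To make the contrast between the two loss functions concrete, the objectives can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: P is a matrix whose columns are document vectors, A is the dictionary, X holds the sparse codes, and λ is a regularization weight. Any constraints the paper places on A or X are omitted.

```latex
% Traditional dictionary learning: squared (Frobenius) reconstruction error
\min_{A,\,X} \; \|P - AX\|_F^2 + \lambda \|X\|_1

% \ell_1-dictionary learning: entrywise \ell_1 reconstruction error
\min_{A,\,X} \; \|P - AX\|_1 + \lambda \|X\|_1,
\qquad \text{where } \|M\|_1 = \sum_{i,j} |M_{ij}|
```

Under this reading, a document would be flagged as novel when its reconstruction error against the current dictionary is large; the exact decision rule is not stated in the abstract.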
