Mohammed J. Zaki, Charu C. Aggarwal
Machine Learning
The problem of projected clustering was first proposed at the ACM SIGMOD Conference in 1999, and the Probabilistic Latent Semantic Indexing (PLSI) technique was independently proposed at the ACM SIGIR Conference in the same year. Since then, more than two thousand papers have been written on these problems by the database, data mining, and information retrieval communities, along completely independent lines of work. In this paper, we show that these two problems are essentially equivalent under a probabilistic interpretation of the projected clustering problem. We show that the EM algorithm, when applied to the probabilistic version of the projected clustering problem, can be interpreted almost identically to the PLSI technique. The implications of this equivalence are significant: they imply the cross-usability of many of the techniques developed for these problems over the last decade. We hope that our observations about the equivalence of these problems will stimulate further research that can significantly improve the currently available solutions for either problem.
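For readers unfamiliar with the PLSI side of this equivalence, the following is a minimal sketch of the standard EM iteration for the symmetric PLSI model, P(d, w) = sum over z of P(z) P(d|z) P(w|z). It is not taken from the paper and does not reproduce its projected clustering derivation; the function name `plsi_em`, its parameters, and the toy count matrix are assumptions chosen purely for illustration.

```python
import math
import random


def plsi_em(counts, num_topics, num_iters=50, seed=0):
    """Fit a symmetric PLSI model by EM (illustrative sketch only).

    counts[d][w] is the number of times word w occurs in document d.
    Returns (p_z, p_d_given_z, p_w_given_z, log_likelihoods).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_dist(size):
        # Strictly positive random starting point, normalised to sum to 1.
        vals = [rng.random() + 1e-3 for _ in range(size)]
        total = sum(vals)
        return [v / total for v in vals]

    p_z = [1.0 / num_topics] * num_topics
    p_d_z = [random_dist(n_docs) for _ in range(num_topics)]   # p_d_z[z][d]
    p_w_z = [random_dist(n_words) for _ in range(num_topics)]  # p_w_z[z][w]
    history = []

    for _ in range(num_iters):
        new_z = [0.0] * num_topics
        new_d = [[0.0] * n_docs for _ in range(num_topics)]
        new_w = [[0.0] * n_words for _ in range(num_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z) P(d|z) P(w|z).
                joint = [p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                         for z in range(num_topics)]
                total = sum(joint)
                # Log-likelihood under the parameters used in this E-step.
                log_lik += n * math.log(total)
                for z in range(num_topics):
                    r = n * joint[z] / total  # expected count assigned to z
                    new_z[z] += r
                    new_d[z][d] += r
                    new_w[z][w] += r
        # M-step: renormalise the expected counts into probabilities.
        grand = sum(new_z)
        for z in range(num_topics):
            p_d_z[z] = [v / new_z[z] for v in new_d[z]]
            p_w_z[z] = [v / new_z[z] for v in new_w[z]]
        p_z = [v / grand for v in new_z]
        history.append(log_lik)
    return p_z, p_d_z, p_w_z, history


# Toy document-word counts (made up for illustration): two loose word groups.
toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 5]]
p_z, p_d_z, p_w_z, log_liks = plsi_em(toy_counts, num_topics=2, num_iters=30)
```

Each latent component z carries its own distribution over words (dimensions) and over documents (points), which is the intuition behind reading a component as a cluster defined on a weighted subset of dimensions. The recorded log-likelihoods should be non-decreasing across iterations, as expected of EM.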
Chun Li, Charu C. Aggarwal, et al.
SDM 2011
Guo-Jun Qi, Charu C. Aggarwal, et al.
ICDE 2013
Shiyu Chang, Guo-Jun Qi, et al.
ICDM 2014