Frequent pattern mining with uncertain data

Charu C. Aggarwal; Yan Li; Jianyong Wang; Jing Wang

doi:10.1145/1557019.1557030

KDD 2009

Conference paper

09 Nov 2009

Abstract

This paper studies the problem of frequent pattern mining with uncertain data. We will show how broad classes of algorithms can be extended to the uncertain data setting. In particular, we will study candidate generate-and-test algorithms, hyper-structure algorithms and pattern growth based algorithms. One of our insightful observations is that the experimental behavior of different classes of algorithms is very different in the uncertain case as compared to the deterministic case. In particular, the hyper-structure and the candidate generate-and-test algorithms perform much better than tree-based algorithms. This counter-intuitive behavior is an important observation from the perspective of algorithm design of the uncertain variation of the problem. We will test the approach on a number of real and synthetic data sets, and show the effectiveness of two of our approaches over competitive techniques. Executable and Data Sets: Available at: http://dbgroup.cs.tsinghua.edu.cn/liyan/u-mining.tar.gz. Copyright 2009 ACM.
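As a rough illustration of what a candidate generate-and-test algorithm might look like in the uncertain setting, the sketch below assumes the common expected-support model: each item in a transaction carries an independent existence probability, and the expected support of an itemset is the sum, over transactions, of the product of its items' probabilities. The abstract does not spell out this model, and the function names (`expected_support`, `mine_expected_frequent`), the threshold name `min_esup`, and the toy data are all hypothetical; this is not the authors' implementation.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Expected support under the assumed item-independence model.

    Each transaction is a dict mapping item -> existence probability;
    an item absent from the dict is treated as having probability 0.
    """
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in transactions)


def mine_expected_frequent(transactions, min_esup):
    """Level-wise (Apriori-style) search using expected support.

    Pruning is sound because every probability is at most 1, so the
    expected support of a superset never exceeds that of any subset.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]  # size-1 candidates
    result = {}
    k = 1
    while level:
        # Test step: keep candidates whose expected support meets the threshold.
        frequent = {}
        for cand in level:
            esup = expected_support(cand, transactions)
            if esup >= min_esup:
                frequent[cand] = esup
        result.update(frequent)

        # Generate step: join frequent k-itemsets into (k+1)-candidates and
        # drop any candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        level = list(candidates)
    return result


# Hypothetical toy data: three uncertain transactions.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 1.0},
    {"a": 0.6, "b": 0.5, "c": 0.4},
]
result = mine_expected_frequent(transactions, min_esup=1.0)
```

On this toy data the itemset {a, b} has expected support 0.9 × 0.8 + 0.6 × 0.5 = 1.02, so it passes a threshold of 1.0, while {a, c} (0.74) and {b, c} (0.2) are pruned and no three-item candidate is generated.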
