Depth first generation of long patterns

Ramesh C. Agarwal; Charu C. Aggarwal; V.V.V. Prasad

doi:10.1145/347090.347114

KDD 2000

Conference paper

20 Aug 2000

Abstract

In this paper we present an algorithm for mining long patterns in databases. The algorithm finds large itemsets by using depth-first search on a lexicographic tree of itemsets. The focus of this paper is to develop CPU-efficient algorithms for finding frequent itemsets when the database contains very wide patterns. We refer to this algorithm as DepthProject; it achieves more than an order of magnitude speedup over the recently proposed MaxMiner algorithm for finding long patterns. These techniques may be quite useful in application areas such as computational biology, in which the number of records is relatively small but the itemsets are very long. Such domains call for pattern discovery algorithms that are specifically tailored to their nature.
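To make the idea of depth-first search on a lexicographic tree of itemsets concrete, here is a minimal Python sketch. It is an illustration only, not the DepthProject algorithm itself: the function name `mine_frequent_itemsets`, the helper `extend`, and the choice to shrink the transaction list at each node are assumptions made for this example, and none of the paper's CPU optimizations or long-pattern pruning are reproduced.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def mine_frequent_itemsets(
    transactions: Sequence[FrozenSet[str]], min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Enumerate frequent itemsets depth first over a lexicographic tree.

    Hypothetical helper for illustration; returns {itemset: support count}.
    """
    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(
        prefix: Tuple[str, ...],
        candidates: List[str],
        projected: List[FrozenSet[str]],
    ) -> None:
        # Count each candidate extension only in the transactions that
        # already contain every item of the current prefix.
        counts = {item: 0 for item in candidates}
        for transaction in projected:
            for item in candidates:
                if item in transaction:
                    counts[item] += 1

        # Only frequent extensions become children of this node.
        survivors = [i for i in candidates if counts[i] >= min_support]
        for idx, item in enumerate(survivors):
            node = prefix + (item,)
            frequent[node] = counts[item]
            # Assumption of this sketch: keep only transactions containing
            # the new item, so deeper nodes scan a smaller list.
            child_projected = [t for t in projected if item in t]
            # A child may only be extended by items that come later in
            # lexicographic order, so each itemset is visited once.
            extend(node, survivors[idx + 1:], child_projected)

    items = sorted({item for t in transactions for item in t})
    extend((), items, list(transactions))
    return frequent


if __name__ == "__main__":
    data = [
        frozenset("abc"),
        frozenset("ab"),
        frozenset("ac"),
        frozenset("bc"),
        frozenset("abc"),
    ]
    for itemset, support in sorted(mine_frequent_itemsets(data, 3).items()):
        print(itemset, support)
```

Because each child is extended only with items that follow it lexicographically, every itemset appears at exactly one node of the tree, and the recursion reaches a long pattern before its siblings are explored.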
