About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
KDD 2003
Conference paper
Navigating massive data sets via local clustering
Abstract
This paper introduces a scalable method for feature extraction and navigation of large data sets by means of local clustering, where clusters are modeled as overlapping neighborhoods. Under the model, intra-cluster association and external differentiation are both assessed in terms of a natural confidence measure. Minor clusters can be identified even when they appear in the intersection of larger clusters. Scalability of local clustering derives from recent generic techniques for efficient approximate similarity search. The cluster overlap structure gives rise to a hierarchy that can be navigated and queried by users. Experimental results are provided for two large text databases. Copyright 2003 ACM.