Linear time algorithms for clustering problems in any dimensions

Amit Kumar; Yogish Sabharwal; Sandeep Sen

doi:10.1007/11523468_111

ICALP 2005

Conference paper

11 Jul 2005

Linear time algorithms for clustering problems in any dimensions

View publication

Abstract

We generalize the k-means algorithm presented by the authors [14] and show that the resulting algorithm can solve a larger class of clustering problems that satisfy certain properties (existence of a random sampling procedure and tightness). We prove these properties for the k-median and the discrete k-means clustering problems, resulting in O(2(k/ε(1) dn) time (1 + ε)-approximation algorithms for these problems. These are the first algorithms for these problems linear in the size of the input (nd for n points in d dimensions), independent of dimensions in the exponent, assuming k and ε to be fixed. A key ingredient of the k-median result is a (1 + ε)-approximation algorithm for the 1-median problem which has running time O(2(1/ε) d). The previous best known algorithm for this problem had linear running time. © Springer-Verlag Berlin Heidelberg 2005.

Conference paper