Hartigan’s k-means versus Lloyd’s k-means - Is it time for a change?

Noam Slonim; Ehud Aharoni; Koby Crammer

IJCAI 2013

Conference paper

01 Dec 2013

Abstract

Hartigan's method for k-means clustering holds several potential advantages over the classical and prevalent optimization heuristic known as Lloyd's algorithm. For example, it was recently shown that the set of local minima of Hartigan's algorithm is a subset of those of Lloyd's method. We develop a closed-form expression that allows us to establish Hartigan's method for k-means clustering with any Bregman divergence, and further strengthen the case for preferring Hartigan's algorithm over Lloyd's algorithm. Specifically, we characterize a range of problems, with various noise levels of the inputs, for which any random partition represents a local minimum for Lloyd's algorithm, while Hartigan's algorithm easily converges to the correct solution. Extensive experiments on synthetic and real-world data further support our theoretical analysis.
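To make the contrast in the abstract concrete, the following is a minimal sketch of both heuristics for the squared Euclidean case only. It is not the paper's implementation and does not reproduce its closed-form expression for general Bregman divergences. The function names (`cost`, `lloyd`, `hartigan`) and the three-point data set are illustrative assumptions, and the sketch assumes no cluster becomes empty. The Hartigan step uses the standard single-point exchange test: moving a point x from cluster s to cluster t lowers the objective when n_t/(n_t+1) * ||x - c_t||^2 < n_s/(n_s-1) * ||x - c_s||^2.

```python
import numpy as np

def cost(X, labels, k):
    """Sum of squared distances of points to their cluster centroids."""
    total = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts):
            total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return float(total)

def lloyd(X, labels, k, max_iter=100):
    """Batch updates: recompute centroids, then reassign every point."""
    labels = labels.copy()
    for _ in range(max_iter):
        # Assumption: every cluster is non-empty.
        centroids = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new = d.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def hartigan(X, labels, k, max_iter=100):
    """Single-point exchanges, accepted only if the objective decreases."""
    labels = labels.copy()
    counts = np.bincount(labels, minlength=k).astype(float)
    centroids = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
    for _ in range(max_iter):
        moved = False
        for i, x in enumerate(X):
            s = labels[i]
            if counts[s] <= 1:
                continue  # never empty a cluster
            # Decrease in cost from removing x from its current cluster s.
            gain = counts[s] / (counts[s] - 1) * ((x - centroids[s]) ** 2).sum()
            # Increase in cost from adding x to each other cluster.
            add = counts / (counts + 1) * ((x - centroids) ** 2).sum(axis=1)
            add[s] = np.inf
            t = int(add.argmin())
            if add[t] < gain:
                # Update both centroids incrementally.
                centroids[s] = (counts[s] * centroids[s] - x) / (counts[s] - 1)
                centroids[t] = (counts[t] * centroids[t] + x) / (counts[t] + 1)
                counts[s] -= 1
                counts[t] += 1
                labels[i] = t
                moved = True
        if not moved:
            break
    return labels

# Hypothetical 1-D example: the partition {0, 4} | {7} is stable for Lloyd,
# because 4 is closer to its own centroid (2) than to the other (7).
X = np.array([[0.0], [4.0], [7.0]])
init = np.array([0, 0, 1])
lloyd_labels = lloyd(X, init, k=2)
hartigan_labels = hartigan(X, init, k=2)
print(lloyd_labels, cost(X, lloyd_labels, 2))
print(hartigan_labels, cost(X, hartigan_labels, 2))
```

In this toy case Lloyd's update leaves the initial partition unchanged, while the exchange test moves the point 4 into the second cluster and lowers the objective. This is a small instance of the subset relation between local minima that the abstract cites, not of the noise-level result.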
