Publication
ICPR 2008
Conference paper
K-means clustering of proportional data using L1 distance
Abstract
We present a new L1-distance-based k-means clustering algorithm to address the challenge of clustering high-dimensional proportional vectors. The new algorithm explicitly incorporates proportionality constraints in the computation of the cluster centroids, resulting in reduced L1 error rates. We compare the new method to two competing methods: an approximate L1-distance k-means algorithm, where the centroid is estimated using cluster means, and a median L1 k-means algorithm, where the centroid is estimated using cluster medians and the proportionality constraints are imposed by normalization in a second step. An application to clustering projects by their distribution of labor hours across skills illustrates the advantages of the new algorithm. © 2008 IEEE.
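The two competing centroid estimators named in the abstract can be sketched roughly as follows. This is only an illustration, assuming each data point is a vector of non-negative proportions that sums to 1. All function names and the sample data are hypothetical. The sketch does not implement the paper's constrained centroid computation, which the abstract does not specify.

```python
from statistics import median


def l1_distance(x, y):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def mean_centroid(cluster):
    """Coordinate-wise mean; sums to 1 whenever every member sums to 1."""
    n = len(cluster)
    return [sum(col) / n for col in zip(*cluster)]


def normalized_median_centroid(cluster):
    """Coordinate-wise median, rescaled in a second step to sum to 1."""
    med = [median(col) for col in zip(*cluster)]
    total = sum(med)
    if total == 0:
        # Assumption: fall back to the mean if every median is zero.
        return mean_centroid(cluster)
    return [m / total for m in med]


def total_l1_error(cluster, centroid):
    """Total L1 distance from the cluster members to a candidate centroid."""
    return sum(l1_distance(x, centroid) for x in cluster)


# Hypothetical cluster of three proportional vectors (each sums to 1).
cluster = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
]

for name, estimator in [("mean", mean_centroid),
                        ("normalized median", normalized_median_centroid)]:
    centroid = estimator(cluster)
    print(name, centroid, total_l1_error(cluster, centroid))
```

The raw coordinate-wise median minimizes the total L1 error when the centroid is unconstrained, but it need not sum to 1. Rescaling it afterwards restores proportionality at the cost of moving away from that minimizer, which is the kind of gap a centroid computed under the constraint directly would aim to close.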