Learning spectral embedding for semi-supervised clustering
Abstract
In recent years, semi-supervised clustering (SSC) has attracted considerable interest from the machine learning and data mining communities. In this paper, we propose a novel semi-supervised clustering approach with enhanced spectral embedding (ESE), which not only exploits the structure information contained in the data but also makes use of prior side information such as pairwise constraints. Specifically, we first construct a symmetry-favored k-NN graph, which is highly robust to noisy objects and reflects the underlying manifold structure of the data. We then learn an enhanced spectral embedding towards an ideal representation that is as consistent with the pairwise constraints as possible. Finally, by taking advantage of Laplacian regularization, we formulate the learning of the spectral representation as a semidefinite-quadratic-linear program (SQLP) under the squared loss or as a small semidefinite program (SDP) under the hinge loss, both of which can be solved efficiently. Experimental results on a variety of synthetic and real-world data sets show that our approach outperforms state-of-the-art SSC algorithms on both vector-based and graph-based clustering.
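To make the pipeline described above concrete, the following is a minimal illustrative sketch, not the authors' ESE method: it builds a plain symmetrized k-NN affinity graph (the paper's symmetry-favored construction and its constraint-driven SQLP/SDP learning step are omitted) and computes an ordinary spectral embedding from the normalized graph Laplacian. All function names and parameters here are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's ESE/SQLP formulation):
# symmetrized k-NN affinity graph + plain spectral embedding.
import numpy as np
from scipy.spatial.distance import cdist

def knn_affinity(X, k=10, sigma=1.0):
    """Symmetrized k-NN affinity matrix with a Gaussian kernel (assumed construction)."""
    D = cdist(X, X)                        # pairwise Euclidean distances
    W = np.zeros_like(D)
    for i in range(len(X)):
        nbrs = np.argsort(D[i])[1:k + 1]   # skip self (distance 0 sorts first)
        W[i, nbrs] = np.exp(-D[i, nbrs] ** 2 / (2 * sigma ** 2))
    return np.maximum(W, W.T)              # symmetrize: keep an edge if either endpoint has it

def spectral_embedding(W, dim=2):
    """Embedding from the smallest nontrivial eigenvectors of the normalized Laplacian."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]              # drop the trivial constant eigenvector

# Toy example: two noisy concentric circles; cluster Y afterwards (e.g., with k-means).
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
r = np.repeat([1.0, 3.0], 100)
X = np.c_[r * np.cos(t), r * np.sin(t)] + 0.1 * rng.standard_normal((200, 2))
Y = spectral_embedding(knn_affinity(X, k=8, sigma=0.5), dim=2)
```

In the approach summarized by the abstract, the unconstrained embedding step above would instead be learned so that the resulting representation respects the given must-link and cannot-link pairwise constraints.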