High-Performance Recommender System Training Using Co-Clustering on CPU/GPU Clusters
Abstract
Recommender systems are becoming the crystal ball of the Internet because they can anticipate what users may want, even before the users themselves know it. However, the machine-learning algorithms typically involved in training such systems are computationally expensive and may require several days for retraining. Here, we present a distributed approach for load-balancing the training of a recommender system based on state-of-the-art non-negative matrix factorization principles. The approach exploits a heterogeneous cluster of CPUs and GPUs, and achieves a 466-fold performance improvement over the serial CPU implementation and a 15-fold improvement over the best previously reported results for the popular Netflix data set.