Parallel sparse matrix vector product with OpenMP for SMPs in Code-Saturne

V. Szeremi; L. Anton; C. Evangelinos; C. Moulinec; Y. Fournier

Paper

01 Jan 2015

Abstract

This paper proposes a new blocked parallel algorithm for the sparse matrix-vector product, based on the Code_Saturne native matrix format, in order to improve OpenMP scalability. New sparse matrix storage options based on the native matrix format, together with the corresponding algorithms, are implemented in Code_Saturne. In addition, trace-guided optimisations for reduced synchronisation and better load balance are proposed, and their efficiency is investigated on different processor architectures. Results are presented for a range of systems, including the architectures of PRACE Tier-0 machines, IBM Blue Gene/Q and iDataPlex (Sandy Bridge, Ivy Bridge), and Cray XC30 (Ivy Bridge). Initial results indicate that the new algorithm achieves significantly better parallel performance across the tested hardware than the native OpenMP sparse matrix-vector product algorithm.
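To make the idea of blocking a face-based sparse matrix-vector product concrete, the sketch below contrasts a serial reference loop with one possible blocked OpenMP variant. It is an illustration under stated assumptions, not the algorithm of the paper: the layout (a diagonal array `da`, one symmetric off-diagonal coefficient `xa` per face, and a `face_cell` connectivity array) is a simplified stand-in for a native face-based format, and the function names, the contiguous row blocks and the owner-computes rule are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Serial reference: y = A*x for a symmetric matrix stored face by face.
 * da[c]: diagonal coefficient of cell c (assumed layout)
 * xa[f]: off-diagonal coefficient shared by the two cells of face f */
void spmv_native(size_t n_cells, size_t n_faces,
                 const size_t (*face_cell)[2],
                 const double *da, const double *xa,
                 const double *x, double *y)
{
    for (size_t c = 0; c < n_cells; c++)
        y[c] = da[c] * x[c];

    for (size_t f = 0; f < n_faces; f++) {
        size_t i = face_cell[f][0], j = face_cell[f][1];
        y[i] += xa[f] * x[j];   /* two different rows are updated per */
        y[j] += xa[f] * x[i];   /* face: a data race if run in parallel */
    }
}

/* Hypothetical blocked variant: cells are split into n_blocks contiguous
 * row blocks, and each block updates only the rows it owns, so no two
 * threads ever write to the same entry of y. */
void spmv_blocked(size_t n_cells, size_t n_faces,
                  const size_t (*face_cell)[2],
                  const double *da, const double *xa,
                  const double *x, double *y, size_t n_blocks)
{
    assert(n_blocks > 0);

    #pragma omp parallel for schedule(static)
    for (size_t b = 0; b < n_blocks; b++) {
        size_t lo = b * n_cells / n_blocks;        /* first owned row */
        size_t hi = (b + 1) * n_cells / n_blocks;  /* one past the last */

        for (size_t c = lo; c < hi; c++)
            y[c] = da[c] * x[c];

        /* For brevity every block scans all faces; a practical version
         * would precompute the list of faces touching each block. */
        for (size_t f = 0; f < n_faces; f++) {
            size_t i = face_cell[f][0], j = face_cell[f][1];
            if (i >= lo && i < hi) y[i] += xa[f] * x[j];
            if (j >= lo && j < hi) y[j] += xa[f] * x[i];
        }
    }
}
```

In this sketch a face whose two cells fall in different blocks is visited by both owning blocks, trading some redundant reads for the absence of locks or atomics. Compiled without OpenMP support, the pragma is ignored and the blocked loop simply runs serially.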

Workshop paper