Optimization of message passing services on POWER8 InfiniBand clusters

Sameer Kumar; Amith Mamidala; Robert Blackmore; S. S. Sharkawi; K.A. Nysal Jan; Thomas Ward

doi:10.1145/2966884.2966909

EuroMPI 2016

Conference paper

25 Sep 2016



Abstract

We present scalability and performance enhancements to MPI libraries on POWER8 InfiniBand clusters. We explore optimizations in the Parallel Active Messaging Interface (PAMI) libraries. We bypass IB VERBS via low-level inline calls, resulting in low latencies and high message rates. MPI is enabled on POWER8 by extending both MPICH and Open MPI to call the PAMI libraries. The IBM POWER8 nodes have GPU accelerators to optimize the floating-point throughput of the node. We explore optimized algorithms for GPU-to-GPU communication with minimal processor involvement. We achieve a peak MPI message rate of 186 million messages per second. We also present scalable performance in the QBOX and AMG applications.
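
To put the reported message rate in perspective, the following is a minimal sketch of the arithmetic behind a typical windowed message-rate measurement. It is not taken from the paper: the function names, the parameters (`num_senders`, `window_size`, `iterations`), and the assumption that the benchmark posts a fixed window of messages per iteration are all illustrative. Only the figure of 186 million messages per second comes from the abstract.

```python
def message_rate(num_senders, window_size, iterations, elapsed_seconds):
    """Aggregate messages per second for a windowed message-rate test.

    Hypothetical model: each sender posts `window_size` messages per
    iteration, and `elapsed_seconds` is the wall-clock time of the run.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_messages = num_senders * window_size * iterations
    return total_messages / elapsed_seconds


def time_per_message_ns(rate_msgs_per_sec):
    """Average aggregate time budget per message, in nanoseconds."""
    if rate_msgs_per_sec <= 0:
        raise ValueError("rate_msgs_per_sec must be positive")
    return 1e9 / rate_msgs_per_sec


# Peak rate quoted in the abstract: 186 million messages per second.
peak_rate = 186e6
print(f"{time_per_message_ns(peak_rate):.2f} ns per message")
```

At the quoted peak rate, this works out to an aggregate budget of roughly 5.4 ns per message across all communicating processes, which suggests why trimming per-message software overhead matters at this scale.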
