Publication
ICS 2006
Conference paper

Scaling MPI to short-memory MPPs such as BG/L

Abstract

Scalability to large numbers of processes is one of the weaknesses of current MPI implementations. Standard implementations are able to scale to hundreds of nodes, but not beyond. The main problem in these implementations is that they assume some resources (for both data and control-data) will always be available to receive and process unexpected messages. As we will show, this is not always true, especially on short-memory machines such as BG/L, which has 64K nodes but only 512 Mbytes of memory per node. The objective of this paper is to present an algorithm that improves the robustness of MPI implementations for short-memory MPPs. By taking care of both data and control-data reception, the system will scale up to any number of nodes. The proposed solution achieves this goal without any observable overhead when there are no memory problems. Furthermore, in the worst case, when memory resources are extremely scarce, the overhead will never double the execution time (and in this extreme situation, traditional MPI implementations would fail to execute). Copyright (c) 2006 ACM.
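
To make the memory pressure described in the abstract concrete, the following back-of-the-envelope sketch estimates how much buffer space a single receiver would need if many peers sent it messages before matching receives were posted. Only the node count (64K) and the per-node memory (512 Mbytes) come from the abstract. The 8 KiB message size, the one-message-per-sender scenario, and the function names are illustrative assumptions, and the sketch does not represent the algorithm proposed in the paper.

```python
NODES = 64 * 1024          # 64K nodes, as stated in the abstract
NODE_MEMORY = 512 * 2**20  # 512 Mbytes per node, as stated in the abstract


def unexpected_buffer_bytes(senders: int, msgs_per_sender: int, msg_bytes: int) -> int:
    """Worst-case bytes one receiver must hold for unexpected messages."""
    return senders * msgs_per_sender * msg_bytes


def max_bytes_per_sender(senders: int, budget_bytes: int) -> int:
    """Largest per-sender buffer that fits in the given memory budget."""
    return budget_bytes // senders


# Hypothetical scenario: every other node sends one message before the
# receiver has posted a matching receive.
EAGER_BYTES = 8 * 1024  # assumed message size, not taken from the paper

needed = unexpected_buffer_bytes(NODES - 1, 1, EAGER_BYTES)
share = max_bytes_per_sender(NODES, NODE_MEMORY)

print(f"buffer needed: {needed} bytes; node memory: {NODE_MEMORY} bytes")
print(f"per-sender share if all memory were buffers: {share} bytes")
```

Under these assumed numbers, a single round of unexpected messages would consume almost all of a node's memory and leave essentially nothing for the application, which is the kind of situation in which an implementation that assumes buffers are always available cannot continue.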
