# Inferring arbitrary distributions for data and computation

## Abstract

In the era of mult-core systems, one of the key requirements of achieving better utilization of multiple available cores is that of parallelization of code across multiple distributed nodes; this involves (re)distribution of both data and computation. Such a transformation can be a fairly tedious activity considering the possible dependencies (data, control) and interference between different segments of the code. Further, to keep the data accesses local, computation distribution requires appropriate data distribution and vice versa. And this inter-dependence between distribution of data and computation makes the problem challenging. Another important challenge in this context is that the desired distribution may not be one of the well-known distributions (such as blocked, cyclic etc), and thus reasoning about it can be non-trivial. We present a refactoring framework that can help an application developer to incrementally distribute programs in the context of distributed memory multi-core systems. Given a loop and an array accessed therein, the goal of our framework is to distribute the array based on a specified distribution for the loop (or vice versa) such that the number of remote accessed are reduced. Our framework goes beyond the well-known distributions, and can handle any arbitrary distributions. In our initial investigation, we have used our transformations on varied parallel benchmark programs and have been able to show its applicability along the expected lines. © 2010 ACM.