Translating imperative code to MapReduce

Cosmin Radoi; Stephen J. Fink; Rodric Rabbah; Manu Sridharan

doi:10.1145/2660193.2660228

OOPSLA 2014

Conference paper

15 Oct 2014

Translating imperative code to MapReduce

View publication

Abstract

We present an approach for automatic translation of sequential, imperative code into a parallel MapReduce framework. Automating such a translation is challenging: imperative updates must be translated into a functional MapReduce form in a manner that both preserves semantics and enables parallelism. Our approach works by first translating the input code into a functional representation, with loops succinctly represented by fold operations. Then, guided by rewrite rules, our system searches a space of equivalent programs for an effective MapReduce implementation. The rules include a novel technique for handling irregular loop-carried dependencies using group-by operations to enable greater parallelism. We have implemented our technique in a tool called MOLD. It translates sequential Java code into code targeting the Apache Spark runtime. We evaluated MOLD on several real-world kernels and found that in most cases MOLD generated the desired MapReduce program, even for codes with complex indirect updates. Copyright 2014 ACM.

Paper