About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Abstract
Unexpectedly, the rise of serverless computing has also collaterally started the ‘`democratization’' of massive-scale data parallelism. This new trend heralded by PyWren pursues to enable untrained users to execute single-machine code in the cloud at massive scale through platforms like AWS Lambda. Driven by this vision, this article presents Lithops, which carries forward the pioneering work of PyWren to better exploit the innate parallelism of la MapReduce tasks atop several Functions-as-a-Service platforms. Instead of waiting for a cluster to be up and running in the cloud, makes easy the task of spawning hundreds and thousands of cloud functions to execute a large job in a few seconds from start. With Lithops, for instance, users can painlessly perform exploratory data analysis from within a Jupyter notebook, while it is the Lithops's engine which takes care of launching the parallel cloud functions, loading dependencies, automatically partitioning the data, etc. In this article, we describe the design and innovative features of Lithops and evaluate it using several representative applications, including sentiment analysis, Monte Carlo simulations, and hyperparameter tuning. These applications manifest the Lithops ability to scale single-machine code computations to thousands of cores. And very importantly, without the need of booting a cold cluster or keeping a warm cluster for occasional tasks.