Publication
CLOUD 2021
Conference paper

Performance Evaluation of Data-Centric Workloads in Serverless Environments


Abstract

Serverless computing is a cloud-based execution paradigm that allows provisioning resources on-demand, freeing developers from infrastructure management and operational concerns. It typically involves deploying workloads as stateless functions that take no resources when not in use, and is meant to scale transparently. To make serverless effective, providers impose limits on a per-function level, such as maximum duration, fixed amount of memory, and no persistent local storage. These constraints make it challenging for data-intensive workloads to take advantage of serverless because they lead to sharing significant amounts of data through remote storage. In this paper, we build a performance model for serverless workloads that considers how data is shared between functions, including the amount of data and the underlying technology that is being used. The model's accuracy is assessed by running a real workload in a cluster using Knative, a state-of-the-art serverless environment, showing a relative error of 5.52%. With the proposed model, we evaluate the performance of data-intensive workloads in serverless, analyzing parallelism, scalability, resource requirements, and scheduling policies. We also explore possible solutions for the data-sharing problem, like using local memory and storage. Our results show that the performance of data-intensive workloads in serverless can be up to 4.32× faster depending on how these are deployed.

Date

01 Sep 2021

