Floki: A Proactive Data Forwarding System for Direct Inter-Function Communication for Serverless Workflows
Serverless computing has emerged as an architectural choice for building and running containerized data-intensive pipelines. It leaves the tedious work of infrastructure management and operations to the cloud provider, so developers can focus on their core business logic and decompose their jobs into small containerized functions that are managed independently and updated flexibly. To increase platform scalability and flexibility, providers take advantage of hardware disaggregation and require inter-function communication to go through shared object storage. While object storage offers advantages in data persistence and recovery, it is expensive in both performance and resources, which makes it challenging for data-intensive workloads to benefit from serverless computing. In this paper, we present Floki, a data forwarding system for direct inter-function communication that proactively enables point-to-point communication between producer-consumer pairs of containerized functions in a pipeline through fixed-size memory buffers, pipes, and sockets. We benchmark Floki on the principal communication patterns of distributed systems, with data transfers ranging from 1MB to 16GB. Compared with state-of-practice object storage, Floki improves end-to-end time by up to 74.95x, reducing the largest data sharing time from 12.55 to 4.33 minutes, while requiring up to 50,738x fewer disk resources and freeing up to roughly 96GB of space.
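To make the idea of point-to-point forwarding through a fixed-size buffer concrete, the following minimal Python sketch streams a payload from a producer to a consumer over a local socket pair, without writing to any intermediate storage. It is an illustration under stated assumptions, not Floki's implementation: the names `produce`, `consume`, and `forward`, and the 64 KiB value of `CHUNK_SIZE`, are hypothetical, and both ends run in one process, whereas a real deployment would connect separate containerized functions.

```python
import socket
import threading

# Hypothetical fixed buffer size; this value is not taken from the paper.
CHUNK_SIZE = 64 * 1024


def produce(sock: socket.socket, payload: bytes) -> None:
    """Stream the payload to the consumer in fixed-size chunks, then signal EOF."""
    view = memoryview(payload)  # slice without copying the payload
    for offset in range(0, len(view), CHUNK_SIZE):
        sock.sendall(view[offset:offset + CHUNK_SIZE])
    sock.shutdown(socket.SHUT_WR)  # tell the consumer that no more data follows


def consume(sock: socket.socket) -> bytes:
    """Read from the socket through one reusable fixed-size buffer until EOF."""
    buffer = bytearray(CHUNK_SIZE)
    received = bytearray()
    while True:
        n = sock.recv_into(buffer)
        if n == 0:  # the producer closed its write side
            break
        received += buffer[:n]
    return bytes(received)


def forward(payload: bytes) -> bytes:
    """Connect one producer-consumer pair directly and return what arrived."""
    producer_end, consumer_end = socket.socketpair()
    with producer_end, consumer_end:
        worker = threading.Thread(target=produce, args=(producer_end, payload))
        worker.start()
        result = consume(consumer_end)
        worker.join()
    return result
```

Calling `forward(data)` returns the bytes the consumer received, which should match `data`. The transfer buffer stays at `CHUNK_SIZE` bytes regardless of payload size; only the final `received` accumulator, kept here so the result can be checked, grows with the payload.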