Publication
HPCA 2025
Conference paper

Concord: Rethinking Distributed Coherence for Software Caches in Serverless Environments

Abstract

Costly accesses to global storage substantially limit the performance of serverless functions. To mitigate this overhead, data can be cached in the memory of the nodes where functions are executed. Existing caching schemes either (1) restrict a data item to be cached in a single node, causing frequent remote reads, or (2) allow a data item to be cached in multiple nodes concurrently, adding substantial overhead to maintain cache coherence. Unfortunately, both approaches are suboptimal for the access patterns present in serverless workloads, which are characterized by frequent reads to small data items, strong temporal locality, and a small number of nodes that concurrently execute functions of the same application. Driven by these insights, we propose Concord, a distributed software caching system tailored to serverless environments. Concord allows multiple copies of the same data item to be cached in different nodes concurrently, so that each cache can satisfy reads locally. To maintain coherence across software caches, Concord introduces a directory-based distributed coherence protocol. The protocol is inspired by hardware cache coherence, and is enhanced to minimize coherence traffic, reduce contention points, and be robust to node failures and frequent coherence domain changes. Further, with the Concord coherence protocol, we unlock two new capabilities in serverless environments: transactional storage accesses and transparent data-aware function placement. Compared to state-of-the-art serverless caching schemes, Concord running on a 16-node cluster speeds up execution by 2.4× and improves throughput by 1.7×, while using only 6.2 MB of otherwise idle application memory (i.e., 4.8% of the total application memory).
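
The general idea of directory-based coherence across software caches can be illustrated with a minimal sketch. This is not Concord's protocol, whose details the abstract does not give; it is a generic write-invalidate scheme written under simplifying assumptions (a single in-process directory, no failures, no concurrency). All names here (Directory, Node, sharers, peers) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical directory: records which nodes cache each key."""
    sharers: dict = field(default_factory=dict)  # key -> set of node ids


class Node:
    """Hypothetical node with a local software cache."""

    def __init__(self, node_id, directory, storage, peers):
        self.node_id = node_id
        self.directory = directory
        self.storage = storage  # plain dict standing in for global storage
        self.peers = peers      # shared registry: node id -> Node
        self.cache = {}
        peers[node_id] = self

    def read(self, key):
        # A hit is served locally with no coherence traffic.
        if key in self.cache:
            return self.cache[key]
        # On a miss, fetch from storage and register as a sharer.
        value = self.storage[key]
        self.cache[key] = value
        self.directory.sharers.setdefault(key, set()).add(self.node_id)
        return value

    def write(self, key, value):
        # Invalidate every other cached copy before updating.
        others = self.directory.sharers.get(key, set()) - {self.node_id}
        for node_id in others:
            self.peers[node_id].cache.pop(key, None)
        self.storage[key] = value
        self.cache[key] = value
        # The writer is now the only sharer of this key.
        self.directory.sharers[key] = {self.node_id}


storage = {"k": 1}
directory = Directory()
peers = {}
a = Node("a", directory, storage, peers)
b = Node("b", directory, storage, peers)
a.read("k")      # both nodes cache the item
b.read("k")
b.write("k", 2)  # a's copy is invalidated; its next read refetches
```

In this sketch, multiple nodes may hold the same item at once and serve reads locally, while the directory is consulted only on misses and writes. A real system would also have to handle concurrent writers, node failures, and changes in the set of participating nodes, which this sketch ignores.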