Leader set selection for low-latency geo-replicated state machine
Modern planetary-scale distributed systems largely rely on a state machine replication protocol to keep their services reliable, yet this comes with a specific challenge: latency, which is bounded by the speed of light. In particular, clients of a single-leader protocol, such as Paxos, must communicate with the leader, which must in turn communicate with other replicas; a poorly chosen leader may result in unnecessary round-trips across the globe. To cope with this limitation, several all-leader and leaderless alternatives have been proposed recently. Unfortunately, none of them fits all circumstances. In this article we argue that the "right" number of leaders depends on the given replica configuration and the workload. We then present Droopy and Dripple, two sister approaches built upon state machine replication protocols. Droopy dynamically reconfigures the set of leaders, whereas Dripple coordinates state partitions so that each partition can be reconfigured (by Droopy) separately. Our experimental evaluation on Amazon EC2 shows that Droopy and Dripple reduce latency under imbalanced or localized workloads, compared to their native protocol. When most requests are non-commutative, our approaches do not affect the performance of their native protocol, and both outperform a state-of-the-art leaderless protocol.
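
To make the dependence on replica configuration and workload concrete, the following is a minimal sketch of a toy latency model for a single-leader protocol. It is not the Droopy or Dripple algorithm. It assumes a request costs one client-to-leader round trip plus one round trip from the leader to the nearest majority of replicas. The site names, the round-trip times in `RTT_MS`, and the helper names `quorum_rtt`, `expected_latency`, and `best_leader` are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical round-trip times in milliseconds between three sites.
# These numbers are illustrative assumptions, not measurements.
RTT_MS: Dict[str, Dict[str, float]] = {
    "us":   {"us": 0.0,   "eu": 80.0,  "asia": 150.0},
    "eu":   {"us": 80.0,  "eu": 0.0,   "asia": 230.0},
    "asia": {"us": 150.0, "eu": 230.0, "asia": 0.0},
}


def quorum_rtt(rtt: Dict[str, Dict[str, float]], leader: str) -> float:
    """Round trip the leader needs to hear from a majority, counting itself."""
    others = sorted(rtt[leader][r] for r in rtt if r != leader)
    acks_needed = len(rtt) // 2  # acknowledgements required from other replicas
    return others[acks_needed - 1] if acks_needed > 0 else 0.0


def expected_latency(rtt: Dict[str, Dict[str, float]],
                     workload: Dict[str, float],
                     leader: str) -> float:
    """Workload-weighted latency under the toy model described above.

    `workload` maps each client site to its share of requests.
    """
    commit = quorum_rtt(rtt, leader)
    return sum(share * (rtt[site][leader] + commit)
               for site, share in workload.items())


def best_leader(rtt: Dict[str, Dict[str, float]],
                workload: Dict[str, float]) -> str:
    """Pick the single leader that minimizes expected latency."""
    return min(rtt, key=lambda site: expected_latency(rtt, workload, site))


if __name__ == "__main__":
    localized = {"asia": 1.0}                        # all clients in one site
    uniform = {"us": 1 / 3, "eu": 1 / 3, "asia": 1 / 3}
    print(best_leader(RTT_MS, localized))
    print(best_leader(RTT_MS, uniform))
```

Under these assumed numbers, the best single leader changes with the workload: a workload localized in one site favors a leader in that site, while a uniform workload favors the most central site. The model deliberately ignores multiple leaders, partitioning, and request commutativity.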