Power of redundancy: Designing partial replication for multi-tier applications
Abstract
Replicating requests has been shown to be an effective mechanism for protecting application performance against high capacity variability, a common pitfall in the cloud. While prior art centers on single-tier systems, it remains an open question how to design replication strategies for distributed multi-tier systems, where interference from neighboring workloads is entangled with complex tier interdependency. In this paper, we design a first-of-its-kind PArtial REplication system, sPARE, that replicates and dispatches read-only workloads for multi-tier web applications, determining a replication factor for each tier. The two key components of sPARE are (i) the variability-aware replicator, which coordinates the replication levels on all tiers via an iterative search algorithm, and (ii) the replication-aware arbiter, which uses a novel token-based arbitration algorithm (TAD) to dispatch requests in each tier. We evaluate sPARE on web serving and web search applications, namely MediaWiki and Solr, deployed on our private cloud testbed. Our results, based on various interference patterns and traffic loads, show that sPARE improves the tail latency of MediaWiki and Solr by factors of almost 2.7x and 2.9x, respectively.