Impact of load sharing on provisioning services with consistency requirements
Abstract
Providers of services such as online auctions must provision servers to respond quickly to users' requests. Assigning each service to its own set of servers can result in some servers being overloaded while others remain underloaded. In contrast, load sharing spreads all services across the sets of servers, allowing each request to each be serviced from a choice of multiple servers, relieving the busiest servers. Sharing arbitrarily across servers, however, can be costly because of the need to maintain consistency of the service across these servers. In this work we model servicing systems with consistency requirements and analyze the impact of load sharing. Our analysis of greedy algorithms provides upper bounds of the response times, and shows that greedy algorithms can reduce the busiest server's intensity to be very close to the average service intensity - a near-optimal result. We also reveal a surprising result that, when servers' average loads are equal but fluctuate over time, the average response time can be reduced by sharing load. © 2006 IEEE.