Capri: Achieving Predictable Performance in Cloud Spot Markets
Abstract
Large cloud providers offer spot instances at attractive prices to improve resource utilization, resulting in a spot market where users bid for resources and providers adjust prices dynamically. When prices surpass bid values, resources may be reclaimed from users with low bids. Achieving predictable performance in spot markets is challenging for data analytics workloads because they are highly sensitive to preemptions, owing to the high cost of recomputation. We introduce capri, a scheduling system for running cloud data analytics in spot markets in which users may experience periods of degraded performance. capri dynamically predicts the functional relationship between bid and performance, which helps users manage expectations and supports bid advice. We propose a new spot market abstraction called the bribe scheduler, which delivers differentiated service levels based on bids. capri uses a prediction mechanism built on a queueing approximation of the bribe scheduler. capri dynamically estimates parameters to adapt the queueing model and provide accurate performance predictions in the face of time-varying workloads. We collect measurements using capri running two realistic workloads, imdb and tpcds, and demonstrate the accuracy of our approximation and parameter estimation methodology. We show that capri outperforms existing prediction models for spot markets and achieves a median prediction error below 3% in bursty workloads. We find that capri's service level prediction is pessimistic: users are likely to experience better performance than capri predicts for their bids.