Catching the response time tail in the cloud

As modern service systems face pressure to offer competitive prices through cost-effective capacity planning, especially in the cloud computing paradigm, service level agreements (SLAs) are becoming ever more sophisticated, e.g., specifying targets for multiple percentiles of response time. However, predicting even the average response time is no mean feat, whether for real systems or for abstracted queueing models that simplify system details, and managing SLAs defined over multiple response time percentiles is more complicated still.
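To make the notion of a percentile-based SLA concrete, the following sketch (not part of the methodology described here; the SLA targets and workload are hypothetical) checks whether a set of measured response times violates targets on several percentiles:

```python
import random

def percentile(samples, p):
    """Return the p-th percentile (0-100) of samples, nearest-rank method."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, int(round(p / 100.0 * len(s))) - 1))
    return s[k]

# Hypothetical SLA: targets on three response-time percentiles, in ms.
sla = {50: 100.0, 95: 300.0, 99: 500.0}

# Stand-in workload: exponential response times with an 80 ms mean.
# In practice these samples would come from measurements or simulation.
random.seed(0)
samples = [random.expovariate(1 / 80.0) for _ in range(10_000)]

# An SLA violation occurs when any observed percentile exceeds its target.
violations = {p: percentile(samples, p) > target for p, target in sla.items()}
print(violations)
```

A single average-response-time target would miss the tail behavior that the 95th and 99th percentile targets capture, which is why such multi-percentile SLAs are harder to manage.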

To capture these percentiles efficiently, we first develop a novel, autonomic methodology, termed Burst Based Simulation, which combines burst profiling on real systems with complex, state-dependent simulations. Building on this methodology, we then construct an SLA-management analysis: predicting SLA violations for a given request pattern. We evaluate our approach on two types of service systems, virtualized and bare-metal, across wide ranges of SLAs and traffic loads. The results show that our methodology achieves an average error below 15% when predicting different response time percentiles and accurately captures SLA violations.