Publication
MASCOTS 2006
Conference paper

Using site-level modeling to evaluate the performance of parallel system schedulers

Abstract

The conventional performance evaluation methodology for parallel system schedulers uses an open model to generate the workloads used in simulations. In many cases recorded workload traces are simply played back, assuming that they are reliable representatives of real workloads, and leading to the expectation that the simulation results actually predict the scheduler's true performance. We show that the lack of feedback in these workloads results in performance prediction errors, which may reach hundreds of percent. We also show that load scaling, as currently performed, further ruins the representativeness of the workload by generating conditions which cannot exist in a real environment. As an alternative, we suggest a novel site-level modeling evaluation methodology, in which we model not only the actions of the scheduler but also the activity of the users who generate the workload dynamically. This advances the simulation in a manner that reliably mimics the feedback effects found in real sites. In particular, saturation is avoided because the generation of additional work is throttled when the system is overloaded. While our experiments were conducted in the context of parallel scheduling, the idea of site-level simulation is applicable to many other types of systems. © 2006 IEEE.
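The following is a minimal, self-contained sketch of the closed-loop idea the abstract describes: each simulated user submits a job, waits for it to finish, thinks for a while, and only then submits the next one, so new work is throttled automatically when the machine is saturated. The FCFS policy, the distributions, and all parameters below are illustrative assumptions for this sketch, not the authors' actual site-level model.

```python
import heapq
import random

def site_level_sim(n_users=16, jobs_per_user=50, cpus=32, seed=1):
    """Toy closed-loop ("site-level") workload generator and scheduler.

    Each user cycles through: submit -> wait for completion -> think.
    When the machine is overloaded, jobs queue up and users stop
    generating new work until their jobs finish -- the feedback effect
    that open-model trace playback lacks.  All distributions and
    parameters are assumptions made for illustration only.
    """
    rng = random.Random(seed)
    events = []                                   # (time, kind, payload) min-heap
    for u in range(n_users):
        heapq.heappush(events, (rng.expovariate(1 / 10.0), "submit", u))
    queue = []                                    # waiting jobs: (user, size, runtime, t_submit)
    free = cpus
    remaining = {u: jobs_per_user for u in range(n_users)}
    waits = []

    def try_start(now):
        nonlocal free
        while queue and queue[0][1] <= free:      # FCFS: start head of queue if it fits
            user, size, runtime, t_submit = queue.pop(0)
            free -= size
            waits.append(now - t_submit)
            heapq.heappush(events, (now + runtime, "finish", (user, size)))

    while events:
        now, kind, who = heapq.heappop(events)
        if kind == "submit":
            size = rng.randint(1, 8)              # requested CPUs (assumed distribution)
            runtime = rng.expovariate(1 / 60.0)   # runtime (assumed distribution)
            queue.append((who, size, runtime, now))
            try_start(now)
        else:                                     # "finish"
            user, size = who
            free += size
            remaining[user] -= 1
            if remaining[user] > 0:               # think time before the next submission
                heapq.heappush(events,
                               (now + rng.expovariate(1 / 15.0), "submit", user))
            try_start(now)
    return sum(waits) / len(waits)

if __name__ == "__main__":
    print("mean wait, closed-loop feedback model:", site_level_sim())
```

In an open (trace-playback) model the "submit" events would instead be scheduled from fixed trace timestamps, independently of whether earlier jobs have completed, which is how unrealistic saturation can arise when the load is scaled up.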
