Improving resource matching through estimation of actual job requirements

Elad Yom-Tov; Yariv Aridor

HPDC 2006

Conference paper

01 Dec 2006

Improving resource matching through estimation of actual job requirements

Abstract

Heterogeneous clusters and grid infrastructures are becoming increasingly popular. In these computing infrastructures, machines have different resources (e.g., memory sizes, disk space, and installed software packages). These differences give rise to a problem of over-provisioning, that is, sub-optimal utilization of a cluster due to users requesting resource capacities greater than what their jobs actually need. Our analysis of a real workload file (LANL CM5) revealed differences of up to two orders of magnitude between requested memory capacity and actual memory usage. The problem of over-provisioning has received very little attention so far. We discuss different approaches for applying machine learning methods to estimate the actual resource capacities used by jobs. These approaches are independent of the scheduling policies and the dynamic resource-matching schemes used. Our simulations show that these methods can yield an improvement of over 50% in utilization (throughput) of heterogeneous clusters. © 2006 IEEE.

Paper