Publication
HPDC 2006
Conference paper
Improving resource matching through estimation of actual job requirements
Abstract
Heterogeneous clusters and grid infrastructures are becoming increasingly popular. In these computing infrastructures, machines have different resources (e.g., memory sizes, disk space, and installed software packages). These differences give rise to a problem of over-provisioning, that is, sub-optimal utilization of a cluster due to users requesting resource capacities greater than what their jobs actually need. Our analysis of a real workload file (LANL CM5) revealed differences of up to two orders of magnitude between requested memory capacity and actual memory usage. The problem of over-provisioning has received very little attention so far. We discuss different approaches for applying machine learning methods to estimate the actual resource capacities used by jobs. These approaches are independent of the scheduling policies and the dynamic resource-matching schemes used. Our simulations show that these methods can yield an improvement of over 50% in utilization (throughput) of heterogeneous clusters. © 2006 IEEE.
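To make the idea of estimating actual requirements concrete, here is a minimal sketch. It is not the paper's method: the abstract does not specify which learning methods are used, so this example substitutes a simple history-based heuristic. The names `UsageEstimator`, `pick_machine`, and `safety_margin`, and the choice to group jobs by a (user, application) key, are all assumptions made for illustration.

```python
from collections import defaultdict


class UsageEstimator:
    """Hypothetical estimator: predicts a job's memory need from past runs.

    Assumption: jobs from the same (user, app) pair behave similarly.
    This stands in for the learning methods the abstract mentions.
    """

    def __init__(self, safety_margin=1.2):
        # Multiplier applied to the largest observed usage, as headroom.
        self.safety_margin = safety_margin
        # (user, app) -> list of observed peak memory values, in MB.
        self.history = defaultdict(list)

    def record(self, user, app, used_mb):
        """Store the actual peak memory of a finished job."""
        self.history[(user, app)].append(used_mb)

    def estimate(self, user, app, requested_mb):
        """Return the capacity to match against, never above the request."""
        past = self.history.get((user, app))
        if not past:
            # No evidence yet: trust the user's request.
            return requested_mb
        return min(requested_mb, max(past) * self.safety_margin)


def pick_machine(free_mb_by_machine, needed_mb):
    """Return the machine with the least free memory that still fits, or None."""
    candidates = [
        (free, name)
        for name, free in free_mb_by_machine.items()
        if free >= needed_mb
    ]
    return min(candidates)[1] if candidates else None
```

In this sketch, a job that requests 1024 MB but has a history of using about 30 MB can be matched to a small machine instead of occupying a large one, which is the kind of over-provisioning the abstract describes. The estimator sits in front of the matching step and does not depend on how machines are chosen, in the same spirit as the abstract's claim of independence from the scheduling policy.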