Publication
Journal of Parallel and Distributed Computing
Paper
Backfilling with lookahead to optimize the packing of parallel jobs
Abstract
The utilization of parallel computers depends on how jobs are packed together: if the jobs are not packed tightly, resources are lost due to fragmentation. The problem is that the goal of high utilization may conflict with goals of fairness or even progress for all jobs. The common solution is to use backfilling, which combines a reservation for the first job in the interest of progress with packing of later jobs to fill in holes and increase utilization. However, backfilling considers the queued jobs one at a time, and thus might miss better packing opportunities. We propose the use of dynamic programming to find the best packing possible given the current composition of the queue, thus maximizing the utilization on every scheduling step. Simulations of this algorithm, called lookahead optimizing scheduler (LOS), using trace files from several IBM SP parallel systems, show that LOS indeed improves utilization, and thereby reduces the mean response time and mean slowdown of all jobs. Moreover, it is actually possible to limit the lookahead depth to about 50 jobs and still achieve essentially the same results. Finally, we experimented with selecting among alternative sets of jobs that achieve the same utilization. Surprising results indicate that choosing the set at the head of the queue does not necessarily guarantee best performance. Instead, repeatedly selecting the set with the maximal overall expected slowdown boosts performance when compared to all other alternatives checked. © 2005 Elsevier Inc. All rights reserved.
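To illustrate the difference between one-at-a-time backfilling and a dynamic-programming search over the whole queue, here is a minimal sketch. It is not the LOS algorithm from the paper. It assumes a simplified setting in which each queued job is described only by the number of processors it needs, and it ignores runtimes and the reservation for the first job. The function names `greedy_packing` and `best_packing` are hypothetical and chosen for this example.

```python
from typing import List, Tuple


def greedy_packing(job_sizes: List[int], free_procs: int) -> Tuple[int, List[int]]:
    """Consider jobs one at a time in queue order, starting each one that fits."""
    used, chosen = 0, []
    for idx, size in enumerate(job_sizes):
        if used + size <= free_procs:
            used += size
            chosen.append(idx)
    return used, chosen


def best_packing(job_sizes: List[int], free_procs: int) -> Tuple[int, List[int]]:
    """Subset-sum style dynamic program: pick the set of queued jobs that
    uses as many of the free processors as possible (sizes assumed positive)."""
    n = len(job_sizes)
    # reachable[i][c] is True if some subset of the first i jobs uses exactly c processors.
    reachable = [[False] * (free_procs + 1) for _ in range(n + 1)]
    reachable[0][0] = True
    for i, size in enumerate(job_sizes, start=1):
        for c in range(free_procs + 1):
            without_job = reachable[i - 1][c]
            with_job = size <= c and reachable[i - 1][c - size]
            reachable[i][c] = without_job or with_job

    # Highest processor count that any subset of the queue can reach.
    used = max(c for c in range(free_procs + 1) if reachable[n][c])

    # Walk back through the table to recover one optimal set of jobs.
    chosen, c = [], used
    for i in range(n, 0, -1):
        if reachable[i - 1][c]:
            continue  # the same total is reachable without job i-1, so skip it
        chosen.append(i - 1)
        c -= job_sizes[i - 1]
    chosen.reverse()
    return used, chosen


if __name__ == "__main__":
    queue = [6, 5, 5]  # processors requested by each queued job, in queue order
    print(greedy_packing(queue, 10))
    print(best_packing(queue, 10))
```

With 10 free processors and a queue requesting 6, 5, and 5 processors, the one-at-a-time pass starts only the first job and leaves 4 processors idle, while the dynamic program finds the two 5-processor jobs that fill the machine. In this sketch, a bounded lookahead depth would amount to passing only a prefix of the queue, for example `job_sizes[:50]`.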