About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
Journal of Computer and System Sciences
Paper
Maximizing data locality in distributed systems
Abstract
The effectiveness of a distributed system hinges on the manner in which tasks and data are assigned to the underlying system resources. Moreover, today's large-scale distributed systems must accommodate heterogeneity in both the offered load and in the makeup of the available storage and compute capacity. The ideal resource assignment must balance the utilization of the underlying system against the loss of locality incurred when individual tasks or data objects are fragmented among several servers. In this paper we describe this locality-maximizing placement problem and show that an optimal solution is NP-hard. We then describe a polynomial-time algorithm that generates a placement within an additive constant of two from optimal. © 2006 Elsevier Inc. All rights reserved.