Multiple query scheduling for distributed semantic caches

Beomseok Nam; Minho Shin; Henrique Andrade; Alan Sussman

doi:10.1016/j.jpdc.2010.02.002

JPDC

Paper

01 May 2010

Abstract

In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially for computationally expensive queries. Scheduling policies must therefore take into account the dynamic contents of the distributed caching infrastructure. In this paper, we propose and discuss several distributed query scheduling policies that directly consider the available cache contents by employing distributed multidimensional indexing structures and an exponential moving average approach to predicting cache contents. These approaches are shown to produce better query plans and faster query response times than traditional scheduling policies that do not predict dynamic contents in distributed caches. We experimentally demonstrate the utility of the scheduling policies using MQO, a distributed, Grid-enabled, multiple query processing middleware system that we developed to optimize query processing for data analysis and visualization applications. © 2010 Elsevier Inc.
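
The exponential moving average idea mentioned in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes that each query is summarized by a center point in a multidimensional attribute space, that each server's likely cache contents are approximated by an EMA of the centers of the queries recently assigned to it, and that a new query goes to the server with the nearest EMA point. The class name `EmaScheduler`, the method `assign`, and the smoothing factor `alpha` are hypothetical names chosen for illustration.

```python
import math


class EmaScheduler:
    """Hypothetical scheduler: one EMA point per server, nearest point wins."""

    def __init__(self, initial_points, alpha=0.3):
        # alpha is an assumed smoothing factor, not a value from the paper.
        self.alpha = alpha
        self.points = [list(p) for p in initial_points]

    def assign(self, query_center):
        # Pick the server whose EMA point is closest to the query center.
        server = min(
            range(len(self.points)),
            key=lambda i: math.dist(self.points[i], query_center),
        )
        # Fold the query into that server's EMA: new = a * q + (1 - a) * old.
        a = self.alpha
        self.points[server] = [
            a * q + (1 - a) * old
            for q, old in zip(query_center, self.points[server])
        ]
        return server


sched = EmaScheduler([(0.0, 0.0), (10.0, 10.0)], alpha=0.5)
first = sched.assign((1.0, 1.0))   # nearest to server 0
second = sched.assign((9.0, 9.0))  # nearest to server 1
```

This sketch considers only predicted cache contents. The abstract treats load balancing as equally important, so a fuller policy would also have to account for how much work each server already holds.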

Talk