# A Graph Theoretical Approach to Determine a Join Reducer Sequence in Distributed Query Processing

## Abstract

Semijoin has traditionally been relied upon to reduce the cost of data transmission for distributed query processing. However, judiciously applying join operations as reducers can lead to further reduction in the amount of data transmission required. In view of this fact, we explore in this paper the approach of using join operations as reducers in distributed query processing. We shall first show that the problem of determining a sequence of join operations for a query can be transformed to that of finding a specific type of set of cuts to the corresponding query graph, where a cut to a graph is a partition of nodes in that graph. Then, in light of this concept, we shall prove that the problem of determining the optimal sequence of join operations for a given query graph is of exponential complexity, thus justifying the necessity of applying heuristic approaches to solve this problem. By mapping the problem of determining a sequence of join reducers into the one of finding a set of cuts, we develop (for tree and general query graphs, respectively) efficient heuristic algorithms to determine a join reducer sequence for distributed query processing. The algorithms developed are based on the concept of divide and conquer and are of polynomial time complexity. Simulation is performed to evaluate these algorithms. © 1994 IEEE