About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ICDE 1984
Conference paper
Distributed query optimization: An engineering approach
Abstract
We present a novel preprocessing technique for distributed query optimization which achieves nearly all of the benefits that full database reduction by semijoin achieves, but for a small fraction of the cost. Most previous researchers have approached the problem of distributed query optimization with the idea of achieving maximum database reduction at any cost. We take an engineering approach-that, past a point of diminishing returns, database reduction is not cost-effective. We introduce a technique called 'partial reduction', which uses a sequence of 'bucket semijoins' to eliminate nearly all of the irrelevant tuples eliminated by a fully reducing sequence of semijoins. We provide a performance analysis for a simple binary semijoin compared to a bucket semijoin, and show that a bucket semijoin can achieve, say, 90% of the reduction that a semijoin can, but for, say, 10% of the cost. Although we used some simplifying assumptions in our analysis, we argue that relaxing these assumptions does not diminish our results in any way. We discuss the sensitivity of our technique to its parameters, and we show that our results improve for multiple join and inequality join queries.