Publication
IEEE Transactions on Knowledge and Data Engineering
Paper

Combining join and semi-join operations for distributed query processing


Abstract

In this paper, we explore applying a combination of join and semi-join operations to minimize the amount of data transmission required for distributed query processing. Specifically, we identify and exploit two important concepts that arise when join operations are used as reducers in query processing: gainful semi-joins and pure join attributes. Some semi-joins, though not profitable by themselves, may benefit the execution of subsequent join operations and thus become profitable once join operations are used as reducers. Such a semi-join is termed a gainful semi-join. Join attributes that are not part of the output attributes are referred to as pure join attributes. We not only exploit the usefulness of gainful semi-joins but also use the removability of pure join attributes to reduce the amount of data transmission required for query processing. In light of these two concepts, we develop heuristic searches to determine a sequence of join and semi-join reducers for query processing. Our results show the importance of combining joins and semi-joins for distributed query processing.
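To make the terminology in the abstract concrete, the following is a minimal Python sketch of a semi-join used as a reducer and of dropping a pure join attribute before shipping a join result. It is an illustration under stated assumptions, not the paper's algorithm or cost model: the relations `orders` and `customers`, the helper names, and the toy cost measure `size` (number of attribute values shipped) are all hypothetical.

```python
# Toy relations as lists of dicts; names and data are hypothetical.
orders = [
    {"cust": 1, "item": "a"},
    {"cust": 2, "item": "b"},
    {"cust": 3, "item": "c"},
    {"cust": 3, "item": "d"},
    {"cust": 4, "item": "e"},
]
customers = [{"cust": 1, "city": "X"}, {"cust": 3, "city": "Y"}]


def size(rel):
    """Assumed transmission cost: number of attribute values shipped."""
    return sum(len(t) for t in rel)


def project(rel, attrs):
    """Duplicate-free projection of rel onto attrs."""
    seen, out = set(), []
    for t in rel:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def semi_join(r, s, attr):
    """Keep the tuples of r whose attr value also occurs in s."""
    values = {t[attr] for t in s}
    return [t for t in r if t[attr] in values]


def join(r, s, attr):
    """Equi-join of r and s on attr using a hash index on s."""
    index = {}
    for t in s:
        index.setdefault(t[attr], []).append(t)
    return [{**tr, **ts} for tr in r for ts in index.get(tr[attr], [])]


# Semi-join as a reducer: ship the join column of customers to the site
# holding orders, then discard the orders that cannot match.
cost = size(project(customers, ["cust"]))
reduced = semi_join(orders, customers, "cust")
benefit = size(orders) - size(reduced)
profit = benefit - cost  # positive means profitable under this toy model

# Pure join attribute: "cust" is needed for the join but is not an output
# attribute, so it can be projected out before the result is shipped.
joined = join(reduced, customers, "cust")
output = project(joined, ["item", "city"])
saved = size(joined) - size(output)
```

In this sketch a semi-join counts as profitable when `profit` is positive. A gainful semi-join in the abstract's sense would show a non-positive `profit` on its own but lower the cost of a later join reducer; evaluating that requires costing a whole sequence of reducers, which this sketch does not attempt.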
