Publication
DPDS 1987
Conference paper

Effect of skew on join performance in parallel architectures

Abstract

Skew in the distribution of values taken by an attribute is identified as a major factor that can affect the performance of parallel architectures for relational joins. The effect of skew on the performance of two parallel architectures is evaluated using analytic models. In one architecture, called the database machine (DBMC), both data and processing power are distributed, while in the other, called single processor parallel input/output (SPPI), data is distributed but the processing power is concentrated in one processor. The two architectures are compared in terms of the ratio of MIPS (millions of instructions per second) used by DBMC and SPPI to deliver the same throughput and response time. In addition, the horizontal growth potential of DBMC is evaluated in terms of the maximum speedup achievable by DBMC relative to SPPI response time. Both the MIPS ratio and the speedup are found to be very sensitive to the amount of skew. These results suggest that careful thought should be given to parallelizing database applications and to the design of algorithms and query optimizers for parallel architectures.
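The sensitivity of speedup to skew can be illustrated with a toy calculation. The sketch below is not the paper's analytic model. It assumes a Zipf-like distribution of join-attribute values (controlled by a hypothetical parameter `theta`), assumes each distinct value is assigned to a node round-robin by rank, and assumes response time is set by the most heavily loaded node. The function names `zipf_weights` and `max_speedup` are illustrative only.

```python
def zipf_weights(num_values, theta):
    """Fraction of tuples carrying each distinct value (assumed Zipf-like).

    theta = 0 gives a uniform distribution; larger theta means more skew.
    """
    raw = [1.0 / (rank ** theta) for rank in range(1, num_values + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def max_speedup(num_values, num_nodes, theta):
    """Upper bound on speedup when the busiest node sets the response time.

    Assumption: all tuples with the same value go to the same node, and
    values are dealt to nodes round-robin by rank.
    """
    weights = zipf_weights(num_values, theta)
    load = [0.0] * num_nodes
    for index, weight in enumerate(weights):
        load[index % num_nodes] += weight
    # One processor handles a total load of 1.0; the parallel system
    # finishes only when its most loaded node finishes.
    return 1.0 / max(load)


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0):
        print(theta, round(max_speedup(1000, 10, theta), 2))
```

Under these assumptions the speedup with no skew equals the number of nodes, and it falls as `theta` grows because the node holding the most frequent value becomes the bottleneck. No number of additional nodes can push the speedup past the reciprocal of the largest single value's share.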
