Publication
VLDB 1997
Conference paper
Parallel algorithms for high-dimensional proximity joins
Abstract
We consider the problem of parallelizing high-dimensional proximity joins. We present a parallel multidimensional join algorithm based on the epsilon-kdB tree and compare it with the more common approach of space partitioning. An evaluation of the algorithms on an IBM SP2 shared-nothing multiprocessor is presented using both synthetic and real-life datasets. We also examine the effectiveness of the algorithms in the context of a specific data-mining problem, that of finding similar time-series. The empirical results show that our algorithm exhibits good performance and scalability, as well as an ability to handle data skew.
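For readers unfamiliar with the term, a proximity join can be read as finding all pairs of points that lie within a distance epsilon of each other. The sketch below is only an illustration of that problem statement, not the paper's algorithm: it is sequential, it uses a plain uniform grid rather than the epsilon-kdB tree, and it assumes Euclidean distance. The function names `brute_force_join` and `grid_join` are hypothetical and chosen for this example.

```python
from collections import defaultdict
from itertools import product
import math


def brute_force_join(points, eps):
    """Return all index pairs (i, j), i < j, within Euclidean distance eps."""
    pairs = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                pairs.add((i, j))
    return pairs


def grid_join(points, eps):
    """Same result, comparing only points in the same or adjacent eps-wide cells.

    Assumption: a simple uniform grid stands in for a real index structure.
    """
    # Bucket each point by the grid cell that contains it.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(x / eps) for x in p)].append(idx)

    dims = len(points[0]) if points else 0
    pairs = set()
    for cell, members in cells.items():
        # Points within eps differ by at most one cell along every dimension.
        for offset in product((-1, 0, 1), repeat=dims):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in cells.get(neighbour, ()):
                    if i < j and math.dist(points[i], points[j]) <= eps:
                        pairs.add((i, j))
    return pairs


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0), (3.2, 3.1), (10.0, 10.0)]
    print(sorted(grid_join(sample, 1.0)))
```

The grid version inspects 3^d neighbouring cells per occupied cell for d dimensions, so this naive scheme becomes expensive quickly as dimensionality grows, which gives some intuition for why high-dimensional joins call for more specialised structures.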