Skew-Insensitive Parallel Algorithms for Relational Join

Khaled Alsabti; Sanjay Ranka

Back

Journal issue

Skew-Insensitive Parallel Algorithms for Relational Join

Khaled Alsabti and Sanjay Ranka

01/01/1998

Abstract

Computer Sciences

Data skew

Hash-based algorithm

Parallel join

Relational databases

Sort-based algorithm

Join is the most important and expensive operation in relational databases. The parallel join operation is very sensitive to the presence of the data skew. In this paper, we present two new parallel join algorithms for coarse grained machines which work optimally in presence of arbitrary amount of data skew. The first algorithm is sort-based and the second is hash-based. Both of these algorithms employ a preprocessing phase (prior to the redistribution phase) to equally partition the work among the processors. The proposed algorithms have been designed for memory resident-data. However, they can be extended to disk resident-data. These algorithms are shown to be theoretically as well as practically scalable. Experimental results are provided on the IBM SP-2.

Metrics

1 Record Views

Details

Title: Skew-Insensitive Parallel Algorithms for Relational Join
Creators - without role: Khaled Alsabti
Sanjay Ranka
Identifiers: 9951659508331
Academic Unit: King Saud University
Language: English
Resource Type: Journal issue