JISE

[1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

Journal of Information Science and Engineering, Vol. 36 No. 1, pp. 171-190

Bucket-Sorted Hash Join

HYUNKWANG SHIN¹, BYUNG-WON ON^2,+, INGYU LEE³
AND GYU SANG CHOI^1,+
¹Department of Information and Communication Engineering
Yeungnam University
Gyeongsan, Gyeongbuk, 38541 Korea
E-mail: shg3786@naver.com, castchoi@ynu.ac.kr
²Department of Software Convergence Engineering
Kunsan National University
Gunsan-si, Jeollabuk-do, 54150 Korea
E-mail: bwon@kunsan.ac.kr
³Sorrell College of Business
Troy University
Troy, AL 36082, USA
E-mail: inlee@troy.edu

As one of the most important operations in relational database management systems, the join operation is very time-consuming as it needs to merge related records between two tables to produce valuable data. Thus far, several join schemes have been proposed to improve the performance of the join operation, and the hybrid hash-join scheme generally shows the best performance among them. However, this scheme incurs a big overhead during the probing phase as it must scan all records across buckets in the hash table in order to find a corresponding record. In this study, we propose a new hash join scheme, called bucket-sorted hash join, which only maintains records sorted within a bucket. Our proposed scheme can significantly reduce the overhead incurred during the probing phase because all records are sorted within a bucket, and the corresponding records are easily found using a binary search. Our experiments show that the proposed scheme can improve the performance of the join operation by up to 300% in terms of the TPC-H benchmark compared to the hybrid hash join scheme. Thus, the proposed scheme is a viable alternative in hash join operations.

Keywords: database, SQL, hybrid hash join, grace hash join, TPC-H

Retrieve PDF document (JISE_202001_10.pdf)