JISE


  [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]


Journal of Information Science and Engineering, Vol. 36 No. 1, pp. 171-190


Bucket-Sorted Hash Join


HYUNKWANG SHIN1, BYUNG-WON ON2,+, INGYU LEE3
AND GYU SANG CHOI1,+
1Department of Information and Communication Engineering
Yeungnam University
Gyeongsan, Gyeongbuk, 38541 Korea
E-mail: shg3786@naver.com, castchoi@ynu.ac.kr

2Department of Software Convergence Engineering
Kunsan National University
Gunsan-si, Jeollabuk-do, 54150 Korea
E-mail: bwon@kunsan.ac.kr

3Sorrell College of Business
Troy University
Troy, AL 36082, USA
E-mail: inlee@troy.edu


As one of the most important operations in relational database management systems, the join operation is very time-consuming as it needs to merge related records between two tables to produce valuable data. Thus far, several join schemes have been proposed to improve the performance of the join operation, and the hybrid hash-join scheme generally shows the best performance among them. However, this scheme incurs a big overhead during the probing phase as it must scan all records across buckets in the hash table in order to find a corresponding record. In this study, we propose a new hash join scheme, called bucket-sorted hash join, which only maintains records sorted within a bucket. Our proposed scheme can significantly reduce the overhead incurred during the probing phase because all records are sorted within a bucket, and the corresponding records are easily found using a binary search. Our experiments show that the proposed scheme can improve the performance of the join operation by up to 300% in terms of the TPC-H benchmark compared to the hybrid hash join scheme. Thus, the proposed scheme is a viable alternative in hash join operations.


Keywords: database, SQL, hybrid hash join, grace hash join, TPC-H

  Retrieve PDF document (JISE_202001_10.pdf)