



Journal of Information Science and Engineering, Vol. 37 No. 1, pp. 157-183


Determining the Parameters of DBSCAN Automatically Using the Multi-Objective Genetic Algorithm


ZEINAB FALAHIAZAR¹, ALIREZA BAGHERI² AND MIDIA RESHADI³
¹,³Department of Computer Engineering
Science and Research Branch, Islamic Azad University
Tehran, Iran
E-mail: {zfalahiazar; reshadi}@srbiau.ac.ir

²Department of Computer Engineering
Amirkabir University of Technology
Tehran, Iran
E-mail: ar_bagheri@aut.ac.ir


Among clustering algorithms, density-based clustering has many advantages, including simplicity, the ability to detect clusters of different shapes, and the ability to discover outliers. Nevertheless, all current density-based clustering algorithms need input parameters, which are difficult to determine and have a substantial impact on the clustering results. DBSCAN is a density-based clustering algorithm that has been widely used in different scientific fields for many years. In this study, a hybrid algorithm named MOGA-DBSCAN is proposed, which combines the DBSCAN algorithm with a multi-objective genetic algorithm (MOGA). In this algorithm, the clustering problem is treated as a multi-objective optimization problem that optimizes certain cluster validity indices, which indicate the goodness of the clustering solutions. In this way, appropriate values of the DBSCAN parameters can be determined automatically. NSGA-II is used to solve this optimization problem, and a new cluster validity index based on detected outliers is used as a fitness function to increase the quality of solutions. Using such a multi-objective algorithm optimizes several indices simultaneously and yields high-quality results. Moreover, the Delaunay triangulation algorithm, which needs no input parameters, is used to determine the initial bounds of the DBSCAN parameters in order to reduce the number of generations. The results indicate that the proposed algorithm has an acceptable level of accuracy in determining the DBSCAN parameters.


Keywords: density-based clustering, DBSCAN, multi-objective optimization, Delaunay triangulation, cluster validity index
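
The idea of treating DBSCAN parameter selection as a multi-objective problem can be illustrated with a minimal sketch. It is not the authors' MOGA-DBSCAN: the abstract does not specify the validity indices, so two stand-in objectives are assumed here (outlier fraction and mean distance to the cluster centroid, both minimized), and a small hand-picked candidate grid replaces both the NSGA-II search and the Delaunay-based parameter bounds. All function names, the toy data set, and the candidate values are hypothetical. Only the non-dominated (Pareto) filtering step, which underlies NSGA-II's ranking, is shown.

```python
import math
from itertools import product

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 marks an outlier."""
    n = len(points)
    # Neighborhoods include the point itself.
    neigh = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
             for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neigh[i]) < min_pts:
            labels[i] = -1               # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neigh[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neigh[j]) >= min_pts:
                queue.extend(neigh[j])   # only core points expand the cluster
    return labels

def objectives(points, labels):
    """Assumed stand-in objectives (both minimized): outlier fraction, cluster spread."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    outlier_frac = labels.count(-1) / len(points)
    if not clusters:
        return (outlier_frac, math.inf)  # no cluster found: worst possible spread
    total, count = 0.0, 0
    for members in clusters.values():
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum(math.dist(p, centroid) for p in members)
        count += len(members)
    return (outlier_frac, total / count)

def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scored):
    """Keep the (params, objectives) pairs that no other candidate dominates."""
    return [(params, obj) for params, obj in scored
            if not any(dominates(other, obj) for _, other in scored)]

# Toy data (hypothetical): two compact blobs and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (5, 20)]

# Hand-picked candidates standing in for the genetic search.
candidates = list(product([0.5, 0.8, 20.0], [3, 4]))
scored = [((eps, m), objectives(points, dbscan(points, eps, m)))
          for eps, m in candidates]
front = pareto_front(scored)

for (eps, m), obj in front:
    print(f"eps={eps}, min_pts={m}, objectives={obj}")
```

The two objectives pull in opposite directions: a large eps absorbs the isolated point but produces one loose cluster, while a moderate eps yields compact clusters at the cost of one outlier. A multi-objective search returns such trade-off solutions as a set rather than forcing a single weighted score.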
