JISE




Journal of Information Science and Engineering, Vol. 26 No. 2, pp. 461-483


HIC: A Robust and Efficient Hyper-Image-Based Clustering for Very Large Datasets


KUN-CHE LU AND DON-LIN YANG
Department of Information Engineering and Computer Science 
Feng Chia University 
Taichung, 407 Taiwan


    Most existing clustering approaches require several scans of a dataset and incur a very high computational cost. In this paper, we propose a novel, efficient, and effective clustering framework that requires only one scan of the input dataset. First, the original dataset is transformed and merged into a hyper-image. The dissimilarities between data points are then measured, once and for all, using various image-processing methodologies. Next, image segmentation techniques are applied to extract clusters from the hyper-image. The resulting clusters can be further processed to achieve fuzzy and/or hierarchical clustering effects. Moreover, the proposed framework can cluster incrementally, and even dynamically, with only one scan of the updated records. With this capability, it can also be used to cluster streaming data effectively. Experimental results show that our approach is robust and stable under various parameter settings and data distributions, and that it is more powerful and sophisticated than other methodologies.
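
    The pipeline described above (one scan to build an image, then segmentation to extract clusters) can be illustrated with a minimal sketch. It is not the HIC algorithm itself, whose details the abstract does not give. It assumes two-dimensional points within a known range, a plain count grid standing in for the hyper-image, and 4-connected component labeling standing in for the segmentation step. The names build_hyper_image and segment, and the parameters bins, lo, hi, and threshold, are hypothetical.

```python
from collections import deque

def build_hyper_image(points, bins, lo, hi):
    """Single pass over the data: count the points falling in each grid cell.
    Assumption: every coordinate lies in [lo, hi]."""
    image = [[0] * bins for _ in range(bins)]
    scale = bins / (hi - lo)
    for x, y in points:
        col = min(int((x - lo) * scale), bins - 1)
        row = min(int((y - lo) * scale), bins - 1)
        image[row][col] += 1
    return image

def segment(image, threshold=1):
    """Label 4-connected regions of cells whose count is >= threshold.
    Assumption: this simple labeling stands in for the segmentation step."""
    bins = len(image)
    labels = [[0] * bins for _ in range(bins)]  # 0 means "no cluster"
    n_labels = 0
    for r in range(bins):
        for c in range(bins):
            if image[r][c] < threshold or labels[r][c]:
                continue
            n_labels += 1
            labels[r][c] = n_labels
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < bins and 0 <= nc < bins
                            and image[nr][nc] >= threshold and not labels[nr][nc]):
                        labels[nr][nc] = n_labels
                        queue.append((nr, nc))
    return labels, n_labels

# Toy data: two well-separated groups of three points each.
points = [(0.5, 0.5), (1.2, 0.8), (1.5, 1.5), (8.1, 8.4), (8.9, 8.2), (9.5, 8.5)]
image = build_hyper_image(points, bins=10, lo=0.0, hi=10.0)
labels, n_clusters = segment(image)
```

    In this sketch the raw points are read only once, when the grid is filled; all later work operates on the grid. New records could be added by incrementing the affected cells and relabeling, which is one plausible reading of the incremental behavior claimed above, not a description of the authors' procedure.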


Keywords: clustering framework, image processing, fuzzy set, hierarchical clustering, dynamic clustering
