
Journal of Information Science and Engineering, Vol. 36 No. 2, pp. 441-465

A Parametric and Non-Parametric Approach for High-Accurate Outlier Detection

Massive Data Computing Laboratory
School of Computer Science and Technology
Harbin Institute of Technology
Harbin, 150001 P.R. China
E-mail: {easybah; wangzh}@hit.edu.cn

Outlier detection is an essential problem that has been studied in a wide range of applications across diverse fields. One common approach to outlier detection uses statistical models, but these methods have inherent challenges and drawbacks, for instance in providing solutions that detect outliers effectively with a high detection rate while minimizing computational cost. The statistical techniques proposed so far fall mainly into parametric and non-parametric methods. To the best of our knowledge, evaluating these two classes of methods against each other remains an open research direction, and most previously proposed statistical methods have not shown high outlier detection accuracy. In this paper, under the general statistical approach, we propose two algorithms, Gaussian Mixture Model for Outlier Detection (GMMOD) for the parametric approach and Kernel Density Estimation for Outlier Detection (KDEOD) for the non-parametric approach, to detect outliers more effectively and to improve outlier detection accuracy. The proposed methods are applied to real-world datasets. Our experimental results show that although both techniques perform well, KDEOD outperforms GMMOD by a small margin in most cases, and both show improved performance over comparable existing algorithms.

Keywords: outlier detection, parametric, non-parametric, Gaussian mixture model, kernel density estimate
