JISE




Journal of Information Science and Engineering, Vol. 27 No. 6, pp. 1855-1870


Impact of Behavior Clustering on Web Surfer Behavior Prediction


MALIK TAHIR HASSAN AND ASIM KARIM
Department of Computer Science 
LUMS School of Science and Engineering 
Lahore, 54792 Pakistan


    We investigate Web surfer behavior prediction by building generative and discriminative models on the entire history of navigation paths and on behavior clustering of the history. The underlying question that we try to answer is: Does behavior clustering improve behavior prediction? For behavior clustering, we adapt the k-modes clustering algorithm by incorporating a new similarity measure that gives greater weight to matches at the beginning of the navigation path. The initial cluster representatives are selected from the set of most dissimilar paths, which also fixes the number of clusters. For generative prediction, we adopt Markov chain Bayesian classification models, whereas for discriminative prediction we build SVM models. Experiments are performed on two real-world data sets. Surprisingly, the results show that behavior clustering has no significant impact on Web surfer behavior prediction. We also investigate the impact of time of visit, the number of relevant clusters used in prediction models, and the use of cluster modes on Web surfer behavior prediction. We find that for limited-scope data, simpler approaches such as prediction using cluster modes can produce highly accurate predictions (less than 1% drop from the best prediction) with greater efficiency.
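
    The idea of a similarity measure that favors matches at the beginning of a navigation path can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the linearly decreasing weights, the normalization, and the function name order_weighted_similarity below are assumptions chosen only for illustration.

```python
from typing import Sequence


def order_weighted_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Return a similarity in [0, 1] between two navigation paths.

    Assumption (not from the paper): position i (0-based) in a comparison
    of length n carries weight n - i, so earlier matches count more.
    """
    n = max(len(a), len(b))
    if n == 0:
        return 1.0  # two empty paths are treated as identical

    weights = [n - i for i in range(n)]
    # zip stops at the shorter path, so unmatched trailing positions add 0.
    matched = sum(w for w, x, y in zip(weights, a, b) if x == y)
    return matched / sum(weights)


# Hypothetical page categories, used only to exercise the function.
path_a = ["home", "news", "sports"]
path_b = ["home", "news", "weather"]   # differs from path_a at the end
path_c = ["blog", "news", "sports"]    # differs from path_a at the start

sim_late_mismatch = order_weighted_similarity(path_a, path_b)
sim_early_mismatch = order_weighted_similarity(path_a, path_c)
```

    Under these assumed weights, a mismatch on the last page costs less than a mismatch on the first page, so path_b scores as more similar to path_a than path_c does. A k-modes style procedure could use such a score in place of simple matching when assigning paths to cluster representatives.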


Keywords: clustering, navigation path prediction, order weighted similarity, sequence prediction, web usage mining, generative discriminative models
