Journal of Information Science and Engineering, Vol. 35 No. 4, pp. 787-804

RCUHP-SM: A Rules Generation and Clustering based Uncovering Hidden Patterns in Social Media

1Department of Computer Science and Engineering
Mohamed Sathak Engineering College
Tamil Nadu, 623806 India

2Department I/C, Department of Computer Science and Engineering
Kamaraj College of Engineering and Technology
Tamil Nadu, 626001 India

Uncovering hidden patterns in social media has recently become an important task. Several pattern mining techniques have been proposed for this purpose in earlier work, but they have drawbacks: vague termination criteria, a lack of interpretability, a tendency to extract meaningless patterns, and an inability to accommodate constraints within a time interval. To overcome these issues, this paper proposes a Rule Generation and Clustering based Uncovering Hidden Patterns in Social Media (RCUHP-SM) technique. The main aim of the technique is to analyze, observe and understand human behavior. First, a customer review dataset is given as input and preprocessed by eliminating irrelevant and unwanted attributes. Next, descriptive sentences are extracted from the preprocessed data, and a score is calculated for each sentence by counting its tagged words. The score is based on the positive, negative and neutral reviews that users give each product. Then, a set of rules, R1 to R27, is framed to predict the category of a review. Subsequently, a threshold value is calculated to form least similar, moderately similar and most similar cluster groups, which are then labeled C1 to C6 according to category. In the analysis phase, features are extracted from the product description and their corresponding scores are computed. The features are then sorted by score and analyzed for recommendation. The novelty of this work lies in the rule generation, similarity computation, threshold based cluster formation and analysis stages. In experiments, the performance of the proposed system is evaluated and compared in terms of the Mean Absolute Precision (MAP), Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) measures.

Keywords: hidden pattern mining, stop words removal, parts of speech (POS) tagging, rule generation, threshold based clustering, similarity computation, filtering and feature analysis
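
For readers unfamiliar with two of the evaluation measures named in the abstract, the following is a minimal sketch of the standard textbook definitions of MAE and RMSE. It is not taken from the paper: the function names and the sample ratings are hypothetical, and the paper may compute these measures over different quantities. MAP is left out because the abstract does not define it.

```python
import math


def mean_absolute_error(actual, predicted):
    """Standard MAE: the average of |actual - predicted| over all pairs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Standard RMSE: the square root of the average squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical review ratings, for illustration only (not data from the paper).
actual_ratings = [4, 3, 5, 2]
predicted_ratings = [3, 3, 4, 4]

print(mean_absolute_error(actual_ratings, predicted_ratings))
print(root_mean_squared_error(actual_ratings, predicted_ratings))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.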
