JISE




Journal of Information Science and Engineering, Vol. 29 No. 6, pp. 1151-1169


PTree: Mining Sequential Patterns Efficiently in Multiple Data Streams Environment


GUANLING LEE, YI-CHUN CHEN AND KUO-CHE HUNG
Department of Computer Science and Information Engineering
National Dong Hwa University
Hualien, 974 Taiwan
E-mail: guanling@mail.ndhu.edu.tw

 


    Although data streams have been widely studied and utilized, sequential pattern mining over data streams remains challenging. In this paper, we assume that a user's transactions arrive in parts and that no auxiliary mechanism is available for buffering and integrating them. We adopt the Path Tree (PTree) to mine frequent sequential patterns over data streams and to integrate users' sequences efficiently. Two algorithms are proposed to meet different user requirements, one oriented toward accuracy (PAlgorithm) and one toward space (PSAlgorithm); in addition, GAlgorithm is proposed for mining frequent sequential patterns with a gap limitation. Several pruning properties are used to further reduce space usage and improve the accuracy of our algorithms. We also prove that PAlgorithm mines frequent sequential patterns with approximate support under an error guarantee. Experiments on synthetic and real datasets verify the feasibility of our algorithms.


Keywords: data mining, multiple data streams, sequential patterns, frequent patterns, knowledge discovery
