JISE




Journal of Information Science and Engineering, Vol. 21 No. 3, pp. 579-605


Nested State-Transition Graph Data Sequencing Model With Hierarchical Taxonomy through Radix Coding


Jia-Sheng Heh, Shein-Yun Cheng and Chang-Kai Hsu 
Department of Information and Computer Engineering 
Chung Yuan Christian University 
Chungli, 320 Taiwan 
E-mail: jsheh@ice.cycu.edu.tw


    The Internet connects electronic information through communication networks, and users access these Internet resources according to different behavior models. This paper proposes a systematic data mining approach that studies users' Internet resource access actions in order to discover their behavior models in the form of state-transition graphs. With a state-transition graph model, it becomes possible to predict the future behaviors of different user communities. Access actions are stored in a database of [user, resource-access-action, time] records, each indicating that the user accessed the resource at the recorded time. These access actions are treated as basic behavior elements and form an action hierarchy whose levels carry different radix codes. For every user, the data sequence is divided into a series of transactions, and all the actions in a transaction constitute a special behavior pattern, called an (inter-transaction) behavior. Behavior codes are aggregated from the corresponding action codes and in turn form a behavior hierarchy. Users can be classified into communities and subgroups by their aggregated behavior codes and behavior transitions. Each community and subgroup has its own behavior model, formulated as a state-transition graph with behavior states and transition probabilities between behaviors. The overall mining process has been implemented in software and is validated here by two examples. The first example uses simulated sequential data to show how the AprioriAll algorithm and the proposed algorithm can be combined to construct a set of nested state-transition graphs. The second example applies the method to a real distance education data set to find predictive models and checks the predictability of these models.
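    As a rough illustration of the pipeline described above, the following Python sketch turns [user, action, time] records into a state-transition graph with estimated transition probabilities. It is not the authors' algorithm: the function name `transition_graph`, the time-gap rule for splitting a data sequence into transactions, and the `gap` parameter are all assumptions made for this example, and the radix coding of the action hierarchy and the AprioriAll step are omitted.

```python
from collections import Counter, defaultdict


def transition_graph(records, gap=30):
    """Estimate behavior-to-behavior transition probabilities.

    records: iterable of (user, action, time) tuples.
    gap: hypothetical threshold; a pause longer than this starts a new
         transaction (the abstract does not say how transactions are cut).
    """
    # Collect each user's data sequence.
    by_user = defaultdict(list)
    for user, action, time in records:
        by_user[user].append((time, action))

    counts = defaultdict(Counter)
    for events in by_user.values():
        events.sort()  # order by time
        behaviors, current, last_time = [], [], None
        for time, action in events:
            if last_time is not None and time - last_time > gap:
                # Close the transaction; its set of actions is one behavior.
                behaviors.append(tuple(sorted(set(current))))
                current = []
            current.append(action)
            last_time = time
        if current:
            behaviors.append(tuple(sorted(set(current))))
        # Count transitions between consecutive behaviors of this user.
        for src, dst in zip(behaviors, behaviors[1:]):
            counts[src][dst] += 1

    # Normalize the counts of each source state into probabilities.
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }
```

    Each key of the returned dictionary is a behavior state (the sorted tuple of actions in a transaction), and each inner dictionary gives the estimated probability of moving to the next behavior. Building one such graph per community or subgroup would give a nested set of models in the spirit of the abstract.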


Keywords: Internet resource, user behavior, data sequencing, state-transition graph, hierarchical taxonomy
