JISE


  [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]


Journal of Information Science and Engineering, Vol. 28 No. 2, pp. 263-297


Subtree Reconstruction, Query Node Intervals and Tree Pattern Query Evaluation


YANGJUN CHEN AND YIBIN CHEN
Department of Applied Computer Science 
University of Winnipeg 
Winnipeg, Manitoba, Canada R3b 2E9 
E-mail: y.chen@uwinnipeg.ca; chenyibin@gmail.com


    Since the extensible markup language XML emerged as a new standard for information representation and exchange on the Internet, the problem of storing, indexing, and querying XML documents has been among the major issues of database research. In this paper, we study the tree pattern matching and discuss a new algorithm for processing ordered tree pattern queries, by which not only ancestor/descendant relationships, but also left-to-right ordering of query nodes are considered. Such kind of tree matching has many applications in practice, such as the linguistic analysis, the video content-based retrieval, as well as the computational biology and the data mining. The time complexities of the new algorithm is bounded by O(|D|.|Q| + |T|. leafQ) and its space overhead is by O(leafT . leafQ), where T stands for a document tree, Q for a tree pattern and D is the largest data stream among all the data streams associated with the nodes in Q. Each data stream contains the database nodes that match the predicate at a node qleafT (leafQ) represents the number of the leaf nodes of T (resp. Q). In addition, the algorithm can be adapted to an indexing environment with XB-trees being used. Experiments have been conducted, which shows that our algorithm is promising.


Keywords: XML documents, tree pattern queries, tree matching, tree encoding, XB-trees

  Retrieve PDF document (JISE_201202_03.pdf)