%0 Journal Article
%T A Data Placement Strategy for Parallel XML Databases
%A WANG Guo-Ren
%A TANG Nan
%A YU Ya-Xin
%A SUN Bing
%A YU Ge
%A 王国仁
%A 汤南
%A 于亚新
%A 孙冰
%A 于戈
%J Journal of Software (软件学报)
%D 2006
%X This paper addresses partitioning strategies for XML documents so that XML queries can be processed in parallel. To describe the XML data partitioning problem, it introduces the concept of an intermediary node. A set of intermediary nodes partitions an XML data tree into a root-tree and a set of sub-trees. The root-tree is replicated on every processing node, while the sub-trees are distributed evenly across the processing nodes according to the workload of user queries. A single XML data tree admits many sets of intermediary nodes, and different sets yield different partitions. The quality of a partitioning can be evaluated against the workload of user queries. Because choosing an optimal partitioning is NP-hard, the paper proposes a set of heuristic rules. Based on these ideas, it designs and implements an XML data partitioning algorithm, WIN; extensive experimental results show that WIN outperforms existing strategies in both speedup and scaleup.
%K parallel database
%K XML document
%K workload
%K data partitioning
%K intermediary node
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=8A48B0788F686CBA&yid=37904DC365DD7266&vid=BCA2697F357F2001&iid=E158A972A605785F&sid=BDEE8BA20F4733DB&eid=DBEE434FCBFED297&journal_id=1000-9825&journal_name=软件学报&referenced_num=0&reference_num=18