GML Document Structure Clustering Based on Closed Frequent Induced Subtrees
pp. 61-64
Keywords: closed frequent induced subtree, GML structural clustering, clustering
Abstract:
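This paper proposes MCF-CLU, a new algorithm for clustering GML documents by structure. Unlike other related algorithms, it clusters on the basis of closed frequent induced subtrees and requires no pairwise similarity comparison between trees during clustering. Instead, it mines the closed frequent induced subtrees of the GML document database, selects one closed frequent induced subtree for each document as that document's representative tree, and groups documents that share the same representative tree into one cluster. The number of clusters is determined automatically during clustering, a description is formed for each cluster, and outliers can be detected. Experimental results show that MCF-CLU is effective and outperforms other algorithms of the same kind.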
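The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation of MCF-CLU: it assumes the closed frequent induced subtrees have already been mined, and that for each document we know which of them it contains. The abstract does not say how the representative tree is selected, so the sketch assumes the largest contained subtree is chosen (ties broken deterministically), and it treats documents that contain no closed frequent subtree as outliers. The names `size` and `cluster_by_representative`, the nested-tuple tree encoding, and the sample labels are all hypothetical.

```python
from collections import defaultdict


def size(tree):
    """Count the nodes of a tree encoded as (label, child, child, ...)."""
    return 1 + sum(size(child) for child in tree[1:])


def cluster_by_representative(contained):
    """Group documents that share the same representative tree.

    contained: dict mapping a document id to the closed frequent
    subtrees found in that document (assumed to be mined beforehand).
    Returns (clusters, outliers); each cluster is keyed by its
    representative tree, which also serves as the cluster description.
    """
    clusters = defaultdict(list)
    outliers = []
    for doc_id, subtrees in contained.items():
        subtrees = list(subtrees)
        if not subtrees:
            # Assumption: a document with no frequent structure is an outlier.
            outliers.append(doc_id)
            continue
        # Assumption: the representative is the largest contained subtree;
        # repr() only makes tie-breaking deterministic.
        representative = max(subtrees, key=lambda t: (size(t), repr(t)))
        clusters[representative].append(doc_id)
    return dict(clusters), outliers


# Hypothetical subtrees over made-up GML element names.
road = ("Road", ("LineString",), ("Name",))
road_small = ("Road", ("LineString",))
lake = ("Lake", ("Polygon",))

contained = {
    "d1": [road, road_small],
    "d2": [road_small, road],
    "d3": [lake],
    "d4": [],
}
clusters, outliers = cluster_by_representative(contained)
```

No pairwise tree comparison occurs in this sketch, and the number of clusters is simply the number of distinct representative trees, which mirrors the two properties the abstract highlights.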