An Improved BIRCH Clustering Algorithm

Keywords: BIRCH algorithm, clustering, threshold, heterogeneous attributes, data mining


Abstract:

BIRCH is a clustering algorithm suited to very large data sets. It builds a CF-tree in which every entry of each leaf node must satisfy a uniform threshold T, and it rebuilds the CF-tree at each stage with a different threshold. However, the original algorithm does not specify how to set the initial threshold or how to increase the threshold at each stage. In addition, it can only work with "metric" attributes, which restricts its application. This paper makes three improvements to BIRCH: 1) it changes the CF structure so that heterogeneous attributes can be handled; 2) it gives a heuristic method for choosing the initial threshold and for increasing the threshold in the second stage of the algorithm; 3) it examines the parameters B and L, finds that the algorithm has equal performance when B = L, and gives a reasonable range for B. Experimental results on public data sets show that the improved algorithm performs favorably.
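
For readers unfamiliar with the threshold T mentioned above, the following is a minimal sketch of a clustering feature (CF) entry and the leaf-level threshold test as they appear in standard BIRCH. It is not the modified CF structure proposed in this paper, which the abstract does not describe. The names `CF` and `try_absorb` are hypothetical, and the sketch assumes numeric ("metric") attributes and a threshold applied to the cluster radius rather than the diameter.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class CF:
    """Standard BIRCH clustering feature: (N, LS, SS)."""
    n: int            # number of points summarized
    ls: List[float]   # per-dimension linear sum of the points
    ss: float         # sum of squared norms of the points

    @classmethod
    def from_point(cls, x: Sequence[float]) -> "CF":
        return cls(1, list(x), sum(v * v for v in x))

    def merged(self, other: "CF") -> "CF":
        # CF additivity: the CF of a union is the component-wise sum.
        return CF(self.n + other.n,
                  [a + b for a, b in zip(self.ls, other.ls)],
                  self.ss + other.ss)

    def radius(self) -> float:
        # R^2 = SS/N - ||LS/N||^2, clamped at 0 against rounding error.
        centroid_sq = sum((v / self.n) ** 2 for v in self.ls)
        return math.sqrt(max(self.ss / self.n - centroid_sq, 0.0))


def try_absorb(entry: CF, x: Sequence[float], threshold: float) -> Optional[CF]:
    """Return the merged entry if its radius stays within the threshold, else None.

    Assumption: the threshold T bounds the radius of a leaf entry.
    """
    candidate = entry.merged(CF.from_point(x))
    return candidate if candidate.radius() <= threshold else None
```

With a small T, few points are absorbed into existing entries and the tree grows quickly; raising T between rebuilds coarsens the entries. This is why the choice of the initial threshold and of its increase matters.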
