Journal of Computer Applications (计算机应用), 2009

Improved BIRCH clustering algorithm
一种改进的BIRCH聚类算法

JIANG Shen-yi, LI Xia
蒋盛益, 李霞

Keywords: BIRCH algorithm, clustering, threshold, heterogeneous attributes, data mining


Abstract:

BIRCH is a clustering algorithm suited to very large data sets. It builds a CF-tree in which every entry of each leaf node must satisfy a uniform threshold T, and it rebuilds the CF-tree at each stage with a different threshold. However, the algorithm does not specify how to set the initial threshold or how to increase the threshold from one stage to the next. In addition, it works only with "metric" attributes, which restricts its application. This paper makes three improvements to BIRCH: 1) it changes the CF structure so that heterogeneous attributes can be handled; 2) it gives a heuristic method for choosing the initial threshold and for increasing the threshold in the second stage of the algorithm; 3) it discusses the parameters B and L, finds that the algorithm has equal performance when B = L, and gives a reasonable range for B. Experimental results on public data sets show that the improved algorithm performs well.
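For readers unfamiliar with the threshold test mentioned in the abstract, the following is a minimal sketch of the standard BIRCH clustering feature for numeric ("metric") attributes only. It is not the modified CF structure proposed in this paper, whose details the abstract does not give. The names `CF`, `try_absorb`, and `threshold` are hypothetical, and the radius is assumed as the quantity compared against T (the diameter is an equally common choice).

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CF:
    """Standard clustering feature: point count, linear sum, sum of squared norms."""
    n: int
    ls: List[float]
    ss: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "CF":
        return cls(1, list(point), sum(x * x for x in point))

    def merged(self, other: "CF") -> "CF":
        # CFs are additive, so merging needs no access to the raw points.
        return CF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )

    def radius(self) -> float:
        # Root mean squared distance of the member points from the centroid.
        centroid_sq = sum((x / self.n) ** 2 for x in self.ls)
        return math.sqrt(max(self.ss / self.n - centroid_sq, 0.0))


def try_absorb(entry: CF, point: Sequence[float], threshold: float) -> Tuple[CF, bool]:
    """Absorb the point into a leaf entry only if the merged radius stays within T."""
    candidate = entry.merged(CF.from_point(point))
    if candidate.radius() <= threshold:
        return candidate, True
    return entry, False


entry = CF.from_point((0.0, 0.0))
entry, absorbed_near = try_absorb(entry, (2.0, 0.0), threshold=1.0)
entry, absorbed_far = try_absorb(entry, (10.0, 0.0), threshold=1.0)
```

In this sketch a larger `threshold` lets each leaf entry absorb more points, which is why the choice of the initial T and of its growth between stages, the subject of improvement 2), directly controls the size of the tree.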
