K-means algorithm based on data field

Keywords: K-means, interaction force among molecules, data field, text clustering


Abstract:

The K-means algorithm has several limitations: the initial class centres are chosen at random, the algorithm is overly sensitive to noise and outliers, and it is not applicable when the clusters differ greatly in shape. To address these deficiencies, this paper draws on the molecular interaction model, treating each text as a data point in a data field, and proposes a new formula for computing the data potential that takes into account the overall similarity and difference of the texts. The formula can be used to discard outliers and to determine the initial class centres according to the potential of the document data. Experiments show that the improved K-means algorithm converges faster, eliminates the adverse effect of noise and outliers on the clustering results, and improves clustering precision. The improved algorithm is therefore well suited to non-uniform subject distributions.
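
The abstract does not reproduce the paper's potential formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian-style potential, in which each point's potential is the sum of exp(-(d/sigma)^2) over all other points, with documents already encoded as numeric vectors. The names `data_potential`, `initial_centres`, `sigma`, and `outlier_frac` are hypothetical. Points with the lowest potential are set aside as outliers, and the initial centres are picked greedily from the highest-potential points that lie more than `sigma` apart.

```python
import math


def data_potential(points, sigma=1.0):
    """Potential of each point in the field generated by all other points.

    Assumed form (not the paper's formula): sum of exp(-(d / sigma)^2).
    """
    phi = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                total += math.exp(-(math.dist(p, q) / sigma) ** 2)
        phi.append(total)
    return phi


def initial_centres(points, k, sigma=1.0, outlier_frac=0.05):
    """Return (centre_indices, outlier_indices) chosen by data potential."""
    phi = data_potential(points, sigma)
    order = sorted(range(len(points)), key=lambda i: phi[i])  # ascending
    n_out = int(outlier_frac * len(points))
    outliers = order[:n_out]          # lowest-potential points are dropped
    chosen = []
    for i in reversed(order[n_out:]):  # highest potential first
        # Accept a candidate only if it is far from every centre so far.
        if all(math.dist(points[i], points[c]) > sigma for c in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return chosen, outliers  # may hold fewer than k centres


# Two tight groups plus one isolated point (index 10).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05),
    (20.0, 20.0),
]
centres, outliers = initial_centres(points, k=2, sigma=1.0, outlier_frac=0.1)
```

The returned indices can seed a standard K-means run on the remaining points. The choice of `sigma` controls how local the field is, and in this sketch it also serves as the minimum separation between centres, which is a simplification.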
