%0 Journal Article
%T K-means clustering algorithm based on optimal initial centers related to pattern distribution of samples in space
%T 基于样本空间分布密度的初始聚类中心优化K-均值算法
%A XIE Juan-ying
%A GUO Wen-juan
%A XIE Wei-xin
%A GAO Xin-bo
%A 谢娟英
%A 郭文娟
%A 谢维信
%A 高新波
%J 计算机应用研究
%D 2012
%I
%X To overcome the sensitivity of the traditional K-means clustering algorithm to its initial centers, and to avoid the arbitrariness of existing improved K-means algorithms in selecting good initial centers, this paper proposes a new algorithm for finding optimal initial centers for K-means clustering. The algorithm defines the density and the neighborhood of each sample according to the natural pattern distribution of samples in the data space, so that the samples chosen as initial seeds not only lie in higher-density areas but are also far away from each other. The new algorithm was tested on several well-known datasets from the UCI machine learning repository and on synthetic datasets with different proportions of noise, using a variety of evaluation measures. The experimental results demonstrate that the new algorithm achieves excellent clustering results in a short run time and is insensitive to noisy data. It outperforms the traditional K-means clustering algorithm and the existing algorithms for improving the initial seeds of K-means clustering.
%K clustering
%K K-means clustering
%K initial centers
%K neighborhood
%K density of pattern distribution
%K 聚类
%K K-均值聚类
%K 初始中心
%K 邻域
%K 样本分布密度
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=A9D9BE08CDC44144BE8B5685705D3AED&aid=F950C016BC33E90154DB7D24FC5B625A&yid=99E9153A83D4CB11&vid=771469D9D58C34FF&iid=38B194292C032A66&sid=68BCD01D0D745EB3&eid=CC5564FFEBD22614&journal_id=1001-3695&journal_name=计算机应用研究&referenced_num=0&reference_num=18