Application Research of Computers (计算机应用研究), 2011
Distributed K-means clustering by learning data density in local peer
Abstract:
A distributed clustering algorithm over a peer-to-peer (P2P) network spreads the time and space complexity evenly across the peers, using each peer's computing and storage capacity as well as the bandwidth of the network. It overcomes the limitation of traditional centralized clustering algorithms in processing distributed data and makes it possible to process and analyze massive distributed data sets. This paper presents a distributed K-means clustering algorithm based on a confidence radius computed at each local peer. The algorithm calculates the data density at the local peer to identify the dense and sparse regions within the same cluster, and uses this to derive the confidence radius that guides the next clustering step. Experimental results show that the algorithm effectively reduces the number of iterations and saves network bandwidth. Meanwhile, its clustering results are close to those of the centralized clustering algorithm.
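The abstract does not give the formula for the confidence radius or say how it steers later iterations, so the following is only a rough Python sketch of the general idea, not the paper's algorithm. It assumes each peer assigns its own points to the current centroids, reports per-cluster sums and counts, and takes as a "confidence radius" the distance from the centroid within which a fixed fraction of its local members lie. The names `local_step`, `merge`, and `core_fraction` are hypothetical.

```python
import math


def local_step(points, centroids, core_fraction=0.8):
    """One local K-means step on a single peer (illustrative assumption).

    Returns per-cluster coordinate sums, member counts, and a radius that
    covers `core_fraction` of the local members of each cluster.
    """
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    dists = [[] for _ in range(k)]
    for p in points:
        # Assign the point to its nearest centroid.
        d = [math.dist(p, c) for c in centroids]
        j = d.index(min(d))
        counts[j] += 1
        for a in range(dim):
            sums[j][a] += p[a]
        dists[j].append(d[j])
    radii = []
    for j in range(k):
        if not dists[j]:
            radii.append(0.0)  # no local members: no dense region here
            continue
        ordered = sorted(dists[j])
        idx = max(0, math.ceil(core_fraction * len(ordered)) - 1)
        radii.append(ordered[idx])  # assumed definition of the radius
    return sums, counts, radii


def merge(peer_stats, old_centroids):
    """Combine per-peer sums and counts into updated global centroids."""
    k, dim = len(old_centroids), len(old_centroids[0])
    total = [[0.0] * dim for _ in range(k)]
    n = [0] * k
    for sums, counts, _radii in peer_stats:
        for j in range(k):
            n[j] += counts[j]
            for a in range(dim):
                total[j][a] += sums[j][a]
    # Keep the old centroid for a cluster that received no points.
    return [
        [total[j][a] / n[j] for a in range(dim)] if n[j] else list(old_centroids[j])
        for j in range(k)
    ]


centroids = [(0.0, 0.0), (10.0, 10.0)]
peer_a = [(0.0, 0.0), (0.0, 2.0)]
peer_b = [(10.0, 10.0), (12.0, 10.0), (0.0, 1.0)]
stats = [local_step(peer_a, centroids), local_step(peer_b, centroids)]
new_centroids = merge(stats, centroids)
```

In this sketch only the small per-cluster summaries leave a peer, never the raw points, which is where a bandwidth saving of the kind reported in the abstract would come from. How the radii would be used to cut the number of iterations is left open here, since the abstract does not specify it.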