计算机应用研究 (Application Research of Computers), 2012
Representative-based distributed data stream clustering algorithm
Abstract:
To find clusters of different shapes in a distributed data stream environment, this paper proposes a representative-based clustering algorithm. First, it introduces the concept of a circular-point, built on representative points, and designs an iterative algorithm that finds density-connected circular-points and then generates a local model at each remote site. Second, it designs an algorithm that generates global clusters by combining the local models at the coordinator site. Experimental results on real and synthetic datasets show that the algorithm finds clusters of different shapes, reduces data transmission by using representative points, and avoids sending data frequently through a test-update strategy.
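
The two-level scheme described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not define circular-points or the test-update rule precisely, so the sketch assumes that a circular-point is a representative center with a fixed radius, that two circular-points are density-connected when their circles overlap, and that a remote site sends an update only when a new circular-point is created. All names (`CircularPoint`, `build_local_model`, `merge_local_models`) and the `radius` parameter are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CircularPoint:
    """Assumed form of a circular-point: a representative center plus a radius."""
    center: tuple
    radius: float
    weight: int = 1  # number of stream points absorbed so far


def build_local_model(stream, radius):
    """Remote site: summarize a stream as circular-points.

    Returns the local model and the number of updates that would be sent.
    Assumed test-update rule: send only when a new circular-point appears.
    """
    model, updates_sent = [], 0
    for x in stream:
        for cp in model:
            # Absorb x into the first circular-point that covers it.
            if math.dist(cp.center, x) <= cp.radius:
                w = cp.weight
                cp.center = tuple((c * w + xi) / (w + 1)
                                  for c, xi in zip(cp.center, x))
                cp.weight = w + 1
                break
        else:
            # No circular-point covers x: open a new one and notify the coordinator.
            model.append(CircularPoint(tuple(x), radius))
            updates_sent += 1
    return model, updates_sent


def merge_local_models(models):
    """Coordinator: group circular-points from all sites whose circles overlap."""
    points = [cp for model in models for cp in model]
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            a, b = points[i], points[j]
            if math.dist(a.center, b.center) <= a.radius + b.radius:
                parent[find(i)] = find(j)

    clusters = {}
    for i, cp in enumerate(points):
        clusters.setdefault(find(i), []).append(cp)
    return list(clusters.values())
```

Because clusters are formed by chaining overlapping circular-points rather than by distance to a single centroid, a sketch of this kind can follow elongated or irregular shapes, and only the circular-points, not the raw stream points, cross the network.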