Computer Science (计算机科学), 2007
A Grid-based Subspace Clustering Algorithm for High-dimensional Data Streams
Abstract:
Based on an analysis of grid-based clustering algorithms, we propose a subspace clustering algorithm that finds clusters in different subspaces of high-dimensional data streams. The algorithm combines the advantages of the bottom-up and top-down grid-based methods. A uniformly partitioned grid data structure is used to summarize the data stream online. A top-down grid partition method is then used to find the subspaces in which clusters are located. Theoretical analysis and a performance study on real and synthetic datasets demonstrate the efficiency and effectiveness of the proposed algorithm.
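
The abstract does not give the data structure or the partition procedure in detail, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes every coordinate lies in a known range [lo, hi), keeps one counter per non-empty cell of a uniform grid as the online summary, and then checks a candidate subspace by projecting the cell counts onto it. The class name `UniformGridSummary`, the parameters `intervals` and `min_fraction`, and the density threshold rule are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, ...]


class UniformGridSummary:
    """Online summary of a stream: one count per non-empty uniform grid cell.

    Illustrative assumption: every coordinate lies in [lo, hi), and each
    dimension is split into the same number of equal-width intervals.
    """

    def __init__(self, dims: int, intervals: int, lo: float = 0.0, hi: float = 1.0):
        self.dims = dims
        self.intervals = intervals
        self.lo = lo
        self.width = (hi - lo) / intervals
        self.counts: Counter = Counter()
        self.total = 0

    def cell_of(self, point: Sequence[float]) -> Cell:
        # Map each coordinate to its interval index, clamped to the grid.
        return tuple(
            min(self.intervals - 1, max(0, int((x - self.lo) / self.width)))
            for x in point
        )

    def insert(self, point: Sequence[float]) -> None:
        # One pass, constant work per point: only the cell counter is kept.
        self.counts[self.cell_of(point)] += 1
        self.total += 1

    def project(self, subspace: Sequence[int]) -> Counter:
        # Sum the full-dimensional cell counts over the dropped dimensions.
        projected: Counter = Counter()
        for cell, n in self.counts.items():
            projected[tuple(cell[d] for d in subspace)] += n
        return projected

    def dense_cells(self, subspace: Sequence[int], min_fraction: float) -> Dict[Cell, int]:
        # A projected cell is "dense" if it holds at least min_fraction
        # of all points seen so far (a hypothetical threshold rule).
        threshold = min_fraction * self.total
        return {c: n for c, n in self.project(subspace).items() if n >= threshold}


if __name__ == "__main__":
    grid = UniformGridSummary(dims=3, intervals=4)
    for p in [(0.1, 0.1, 0.9), (0.15, 0.2, 0.3), (0.2, 0.05, 0.6), (0.9, 0.8, 0.1)]:
        grid.insert(p)
    print(grid.dense_cells([0, 1], 0.5))  # dense region in dimensions 0 and 1
    print(grid.dense_cells([2], 0.5))     # no dense interval in dimension 2
```

In this toy stream three of the four points fall in the same cell of the subspace spanned by dimensions 0 and 1 but spread out along dimension 2, which is the kind of cluster that exists only in a subspace. A top-down search would start from such projections and refine only the dense regions; how the paper chooses and orders the candidate subspaces is not stated in the abstract.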