
An On-line Density-Based Clustering Algorithm for Spatial Data Streams

DOI: 10.3724/SP.J.1004.2012.01051, PP. 1051-1059

Keywords: spatial data mining, data stream clustering, density-based clustering, on-line algorithm, noise handling


Abstract:

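To address the problem of clustering arbitrarily shaped clusters in spatial data streams, this paper proposes an on-line density-based clustering algorithm for spatial data stream (OLDStream). The algorithm clusters incremental spatial data on top of the previous clustering result, performing local cluster updates only for each newly added spatial point and the data in its neighborhood that satisfy the core-point condition. This reduces the time complexity of cluster updates and enables on-line clustering of spatial data streams. OLDStream offers several advantages: it processes large-scale spatial data streams quickly, obtains global clusters of arbitrary shape in real time, is insensitive to the input order of the data stream, and can detect outliers. Comprehensive experiments on real and synthetic data verify the algorithm's clustering quality, efficiency, and scalability. Statistical analysis of the experimental results shows that only 4% of spatial points incur the worst-case running time, and the average clustering time per spatial point is about 0.033 ms.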
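
The local-update idea described in the abstract can be illustrated with a small sketch. This is not the OLDStream algorithm itself, whose details the abstract does not give; it is a generic DBSCAN-style incremental insertion written under stated assumptions. The class name `IncrementalDensityClusterer`, the parameters `eps` and `min_pts`, and the brute-force neighbor search are all hypothetical choices made for illustration only.

```python
import math

NOISE = -1  # label for points that belong to no cluster


class IncrementalDensityClusterer:
    """Illustrative incremental density-based clustering (not OLDStream)."""

    def __init__(self, eps, min_pts):
        self.eps = eps          # assumed neighborhood radius
        self.min_pts = min_pts  # assumed core-point threshold (point itself counts)
        self.points = []
        self.counts = []        # size of each point's eps-neighborhood
        self.labels = []
        self._next_id = 0

    def _neighbors(self, i):
        # Brute-force search; a spatial index would be used in practice.
        p = self.points[i]
        return [j for j, q in enumerate(self.points)
                if math.dist(p, q) <= self.eps]

    def _absorb(self, q):
        # q has just become a core point: join or merge the clusters of the
        # core points in its neighborhood, or start a new cluster.
        nbrs = self._neighbors(q)
        found = {self.labels[j] for j in nbrs
                 if self.counts[j] >= self.min_pts and self.labels[j] != NOISE}
        if found:
            target = min(found)
            if len(found) > 1:  # q connects several clusters: merge them
                self.labels = [target if l in found else l for l in self.labels]
        else:
            target = self._next_id
            self._next_id += 1
        self.labels[q] = target
        for j in nbrs:          # former noise points become border points
            if self.labels[j] == NOISE:
                self.labels[j] = target

    def insert(self, point):
        self.points.append(tuple(point))
        self.counts.append(0)
        self.labels.append(NOISE)
        new = len(self.points) - 1
        nbrs = self._neighbors(new)
        newly_core = []
        for j in nbrs:
            if j == new:
                self.counts[j] = len(nbrs)
                if self.counts[j] >= self.min_pts:
                    newly_core.append(j)
            else:
                self.counts[j] += 1
                if self.counts[j] == self.min_pts:  # just crossed the threshold
                    newly_core.append(j)
        # Only the new point and neighbors that became core are re-examined.
        for q in newly_core:
            self._absorb(q)
        if self.labels[new] == NOISE:
            for j in nbrs:      # attach as a border point of an existing core
                if self.counts[j] >= self.min_pts:
                    self.labels[new] = self.labels[j]
                    break
        return self.labels[new]


clusterer = IncrementalDensityClusterer(eps=1.1, min_pts=3)
for x in (0, 1, 2, 4, 5, 6):
    clusterer.insert((x, 0))
print(clusterer.labels)  # two separate clusters so far
clusterer.insert((3, 0))  # bridging point merges them
print(clusterer.labels)
```

In this sketch each insertion touches only the new point and the neighbors whose neighborhood size has just reached `min_pts`, which mirrors the abstract's description of updating clusters locally rather than reclustering from scratch. The relabeling step during a merge scans all labels for simplicity; a union-find structure would be one way to avoid that scan.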

