A Spectral Algorithm for Text Clustering Using Evidence Accumulation

Keywords: cluster analysis, text clustering, spectral clustering, evidence accumulation, spherical K-means


Abstract:

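To address the difficulty of choosing a similarity function in spectral clustering, this paper proposes a spectral algorithm for text clustering that uses evidence accumulation. The algorithm runs spherical K-means on the document collection multiple times and treats each resulting partition as evidence of whether two documents belong in the same cluster; from this evidence it builds the document similarity matrix and the normalized Laplacian matrix. Experiments on the TREC and Reuters document collections confirm the algorithm's effectiveness: it outperforms hierarchical clustering and the K-means algorithm provided by CLUTO.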
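
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, the number of runs, and the choice of "fraction of runs in which two documents share a cluster" as the similarity are all assumptions made for illustration, and the normalized Laplacian is taken in the common symmetric form L = I - D^{-1/2} S D^{-1/2}.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def spherical_kmeans(docs, k, rng, n_iter=20):
    """Cluster unit-length vectors by cosine similarity; returns one label per document."""
    centroids = [docs[i] for i in rng.sample(range(len(docs)), k)]
    labels = [0] * len(docs)
    for _ in range(n_iter):
        # Assign each document to the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda c: dot(d, centroids[c])) for d in docs]
        # Recompute each centroid as the normalized sum of its members.
        for c in range(k):
            members = [d for d, lab in zip(docs, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = normalize([sum(col) for col in zip(*members)])
    return labels


def coassociation(docs, k, n_runs, seed=0):
    """S[i][j] = fraction of runs that place documents i and j in the same cluster."""
    rng = random.Random(seed)
    n = len(docs)
    S = [[0.0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = spherical_kmeans(docs, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    S[i][j] += 1.0 / n_runs
    return S


def normalized_laplacian(S):
    """L = I - D^{-1/2} S D^{-1/2}, where D holds the row sums of S on its diagonal."""
    n = len(S)
    deg = [sum(row) for row in S]  # deg[i] > 0 because S[i][i] is always 1
    return [[(1.0 if i == j else 0.0) - S[i][j] / math.sqrt(deg[i] * deg[j])
             for j in range(n)] for i in range(n)]


# Toy term-frequency vectors (hypothetical data): two documents per topic.
docs = [normalize(v) for v in [[3, 1, 0, 0], [2, 1, 0, 0], [0, 0, 1, 3], [0, 0, 2, 2]]]
S = coassociation(docs, k=2, n_runs=10)
L = normalized_laplacian(S)
```

The sketch stops at the two matrices named in the abstract. A spectral method would then typically cluster the documents using eigenvectors of `L`; that step is omitted here because the abstract does not describe it in detail.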

