An Optimized Sequential IB Algorithm for Text Clustering

pp. 417-422

Keywords: text clustering, information bottleneck theory, simulated annealing, simulated-annealing-based iterative sequential IB (SA-isIB) algorithm


Abstract:

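The sequential IB (sIB) algorithm has known shortcomings in text clustering, such as a tendency to become trapped in local optima and relatively low efficiency. To address them, an optimized sequential text clustering algorithm based on simulated annealing (SA-isIB) is proposed. Following a suitable annealing schedule, the algorithm randomly selects a certain proportion of documents from the initial clustering produced by the basic sIB algorithm, randomly modifies their cluster labels, and re-optimizes the solution. After the annealing process, it obtains text clustering results that are more accurate than those of sIB. Experimental results on text datasets show that SA-isIB effectively improves the accuracy of sIB for text clustering.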
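
The perturb-and-re-optimize loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them. The annealing schedule is modeled as a decreasing sequence of perturbation ratios (`ratios`). A perturbed solution is kept only if it raises I(T;Y). The sIB draw-and-merge step is replaced by a brute-force search for the label that maximizes I(T;Y). The names `mutual_information`, `sequential_pass`, and `sa_isib` are hypothetical.

```python
import math
import random


def mutual_information(counts, labels, k):
    """I(T;Y) between cluster labels T and words Y for a hard clustering."""
    n_words = len(counts[0])
    total = sum(sum(row) for row in counts)
    joint = [[0.0] * n_words for _ in range(k)]  # p(t, y)
    for row, t in zip(counts, labels):
        for j, c in enumerate(row):
            joint[t][j] += c / total
    p_t = [sum(r) for r in joint]
    p_y = [sum(joint[t][j] for t in range(k)) for j in range(n_words)]
    return sum(p * math.log(p / (p_t[t] * p_y[j]))
               for t in range(k) for j, p in enumerate(joint[t]) if p > 0)


def sequential_pass(counts, labels, k, rng):
    """Move documents one at a time to the label that maximizes I(T;Y).

    Simplified stand-in for the sIB draw-and-merge step (assumption):
    every candidate label is scored by recomputing I(T;Y) from scratch.
    """
    changed = True
    while changed:
        changed = False
        order = list(range(len(counts)))
        rng.shuffle(order)
        for i in order:
            original = labels[i]
            best_t = original
            best_score = mutual_information(counts, labels, k)
            for t in range(k):
                labels[i] = t
                score = mutual_information(counts, labels, k)
                if score > best_score + 1e-12:  # accept strict gains only
                    best_t, best_score = t, score
            labels[i] = best_t
            changed = changed or best_t != original
    return labels


def sa_isib(counts, k, ratios=(0.3, 0.2, 0.1, 0.05), seed=0):
    """Perturb a converged clustering at each annealing step, then re-optimize."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in counts]
    best = sequential_pass(counts, labels, k, rng)  # initial sIB-like solution
    best_score = mutual_information(counts, best, k)
    for ratio in ratios:  # assumed schedule: shrinking perturbation ratio
        trial = list(best)
        n_flip = max(1, round(ratio * len(trial)))
        for i in rng.sample(range(len(trial)), n_flip):
            trial[i] = rng.randrange(k)  # random relabeling
        trial = sequential_pass(counts, trial, k, rng)
        score = mutual_information(counts, trial, k)
        if score > best_score:  # assumed acceptance rule: keep improvements only
            best, best_score = trial, score
    return best, best_score
```

Here `counts` is a document-by-word count matrix given as a list of rows. A call such as `sa_isib(counts, k=2)` returns the final labels and their I(T;Y). Because only improvements are accepted, the returned score is never lower than that of the initial sequential pass.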

