A Text Classification Method Based on Diffusion Kernels on Statistical Manifolds

pp. 339-345

Keywords: statistical manifold, diffusion kernel, Dirichlet distribution, text classification


Abstract:

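This paper proposes the Dirichlet compound multinomial (DCM) manifold. The DCM manifold is homeomorphic and isometric to the positive-hemisphere manifold, so a pullback map carries the geodesic distance on the positive-hemisphere manifold over to the DCM manifold, which gives a distance metric on the DCM manifold. On this basis, two diffusion kernels are constructed on the statistical manifold: the Dirichlet compound multinomial diffusion kernel and the Dirichlet compound multinomial inverse document frequency (DCMIDF) diffusion kernel. Experiments on the WebKB Top4 and 20 Newsgroups corpora show that the DCM manifold describes text more accurately than Euclidean space. Compared with support vector machines that use a polynomial kernel or a negative geodesic distance kernel, the support vector machines based on the DCM and DCMIDF diffusion kernels achieve good text classification results.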
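The pullback construction can be illustrated on a simpler, well-known case. The sketch below does not implement the DCM or DCMIDF kernels, whose defining map is not given in this abstract. It uses the plain multinomial simplex instead, where the square-root map sends a probability vector onto the positive part of a sphere and the geodesic distance becomes 2·arccos(Σ√(p_i q_i)). The diffusion kernel is then approximated to first order as exp(−d²/(4t)), with the normalizing factor dropped. The function names, the smoothing constant, and the default value of t are assumptions made for illustration.

```python
import math


def to_simplex(counts, smoothing=1.0):
    """Map raw term counts to a probability vector (additive smoothing; an assumption)."""
    total = sum(counts) + smoothing * len(counts)
    return [(c + smoothing) / total for c in counts]


def geodesic_distance(p, q):
    """Geodesic distance on the multinomial simplex under the Fisher metric.

    The square-root map p -> sqrt(p) places both points on the positive part
    of a sphere, where the distance is an arc length: 2 * arccos(sum sqrt(p_i * q_i)).
    """
    affinity = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to [0, 1] so rounding error cannot push acos out of its domain.
    affinity = min(1.0, max(0.0, affinity))
    return 2.0 * math.acos(affinity)


def diffusion_kernel(p, q, t=0.5):
    """First-order approximation of the heat kernel: exp(-d^2 / (4t)).

    The normalizing factor is omitted; t is a bandwidth-like parameter.
    """
    d = geodesic_distance(p, q)
    return math.exp(-d * d / (4.0 * t))


if __name__ == "__main__":
    # Hypothetical term-count vectors for three short documents.
    doc_a = to_simplex([5, 1, 0, 2])
    doc_b = to_simplex([4, 2, 0, 1])
    doc_c = to_simplex([0, 0, 7, 1])
    print(diffusion_kernel(doc_a, doc_b))
    print(diffusion_kernel(doc_a, doc_c))
```

A kernel matrix built this way could be passed to any support vector machine implementation that accepts precomputed kernels. Replacing `to_simplex` with the paper's DCM embedding would be the step that turns this into the proposed method, and that embedding is not reproduced here.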

