Text Classification with a Kernel Nearest-Neighbor Algorithm on Statistical Manifolds

Keywords: diffusion kernel, kernel nearest neighbor, Dirichlet compound multinomial (DCM), text classification


Abstract:

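To describe text data more efficiently, text vectors are represented as points on a statistical manifold, and a kernel method is used to combine generative and discriminative models of text. A diffusion kernel on the Dirichlet compound multinomial (DCM) statistical manifold serves as the distance measure on the text space, and a kernel nearest-neighbor algorithm on the DCM manifold is proposed for text classification. Experiments on two corpora show that the precision and recall of the kernel nearest-neighbor algorithm on the DCM manifold are better than, or comparable to, those of the baseline algorithms.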
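The general idea of kernel nearest-neighbor classification on a statistical manifold can be illustrated with a minimal sketch. The abstract does not give the DCM diffusion kernel, so the code below substitutes the leading-order diffusion kernel on the plain multinomial manifold, exp(-arccos²(Σ√(pᵢqᵢ)) / t), with its constant normalization factor dropped. This is an assumption for illustration, not the paper's kernel. The function names, the additive smoothing, the toy data, and the parameter values (`t`, `k`) are likewise hypothetical. The kernel-induced squared distance is K(p,p) − 2K(p,q) + K(q,q).

```python
import math
from collections import Counter

def to_simplex(counts, smoothing=1.0):
    """Map a term-count vector to a point on the multinomial simplex.
    Additive smoothing (an assumption here) keeps every coordinate positive."""
    total = sum(counts) + smoothing * len(counts)
    return [(c + smoothing) / total for c in counts]

def diffusion_kernel(p, q, t=0.5):
    """Leading-order multinomial diffusion kernel, constant factor omitted.
    Stand-in for the DCM-manifold kernel, which the abstract does not specify."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    bc = min(1.0, max(0.0, bc))  # guard acos against rounding error
    return math.exp(-math.acos(bc) ** 2 / t)

def kernel_distance_sq(p, q, t=0.5):
    """Squared distance in the kernel-induced feature space."""
    return (diffusion_kernel(p, p, t)
            - 2.0 * diffusion_kernel(p, q, t)
            + diffusion_kernel(q, q, t))

def kernel_knn_predict(train, query, k=3, t=0.5):
    """Majority vote among the k training points nearest in kernel distance.
    `train` is a list of (simplex_point, label) pairs."""
    ranked = sorted(train, key=lambda item: kernel_distance_sq(item[0], query, t))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy term counts over a hypothetical 4-word vocabulary.
train = [
    (to_simplex([5, 3, 0, 0]), "sports"),
    (to_simplex([4, 4, 1, 0]), "sports"),
    (to_simplex([0, 1, 5, 4]), "finance"),
    (to_simplex([0, 0, 3, 6]), "finance"),
    (to_simplex([1, 0, 4, 4]), "finance"),
]
prediction = kernel_knn_predict(train, to_simplex([3, 2, 0, 1]), k=3)
print(prediction)  # -> sports
```

With this stand-in kernel, K(p,p) = 1 for every point, so the kernel distance is a monotone function of the geodesic distance on the simplex. A different kernel, such as one defined on the DCM manifold, would only require replacing `diffusion_kernel`.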
