A Context Tree Kernel Based on Latent Semantic Topic

Keywords: text clustering, context tree kernel, statistical language models, Latent Dirichlet Allocation (LDA)


Abstract:

A critical weakness of the context tree kernel for text representation is its lack of semantic information. This paper proposes a context tree kernel based on latent topics. First, words are mapped into a latent topic space using Latent Dirichlet Allocation (LDA). Then, context tree models are built over the latent topics. Finally, the context tree kernel between texts is defined through the mutual information between the models. In this approach, document generative models are defined over semantic classes instead of individual words, which addresses the sparsity of the statistical data. Clustering experiments on a text data set show that the proposed context tree kernel is a better measure of topic similarity between documents and that text clustering performance is greatly improved.
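The three steps in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and it rests on several assumptions: the word-to-topic table `WORD_TO_TOPIC` is hand-written and hypothetical (in the described method it would come from a trained LDA model), the context tree is approximated by fixed-depth context counts, and cosine similarity over those counts stands in for the mutual-information kernel, whose exact definition the abstract does not give. All function names are illustrative.

```python
from collections import Counter, defaultdict
import math

# Hypothetical word-to-topic assignment. In the described method this
# mapping would be inferred by LDA, not written by hand.
WORD_TO_TOPIC = {
    "goal": 0, "match": 0, "team": 0,
    "stock": 1, "market": 1, "bank": 1,
}


def to_topics(words, word_to_topic):
    """Replace each known word with its topic id; unknown words are dropped."""
    return [word_to_topic[w] for w in words if w in word_to_topic]


def context_counts(topics, max_depth=2):
    """Count next-topic occurrences for every context of length 0..max_depth.

    A fixed-depth stand-in for a context tree: each context tuple plays the
    role of one tree node, and its Counter holds next-symbol statistics.
    """
    counts = defaultdict(Counter)
    for i, topic in enumerate(topics):
        for depth in range(max_depth + 1):
            if i - depth < 0:
                break
            context = tuple(topics[i - depth:i])
            counts[context][topic] += 1
    return counts


def flatten(counts):
    """Turn nested context counts into a flat {(context, topic): n} dict."""
    return {(c, t): n for c, ctr in counts.items() for t, n in ctr.items()}


def similarity(counts_a, counts_b):
    """Cosine similarity between two context-count models.

    Assumption: cosine is only a simple placeholder here; the paper defines
    its kernel through mutual information between the models.
    """
    fa, fb = flatten(counts_a), flatten(counts_b)
    dot = sum(n * fb.get(key, 0) for key, n in fa.items())
    norm_a = math.sqrt(sum(n * n for n in fa.values()))
    norm_b = math.sqrt(sum(n * n for n in fb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def document_similarity(words_a, words_b, word_to_topic=WORD_TO_TOPIC):
    """Map both documents to topic sequences, then compare their models."""
    model_a = context_counts(to_topics(words_a, word_to_topic))
    model_b = context_counts(to_topics(words_b, word_to_topic))
    return similarity(model_a, model_b)


if __name__ == "__main__":
    sports_a = "team goal match team goal".split()
    sports_b = "match team goal".split()
    finance = "stock market bank".split()
    print(document_similarity(sports_a, sports_b))
    print(document_similarity(sports_a, finance))
```

Because the models are built over topic ids, two documents that share no surface words can still produce overlapping statistics when their words fall into the same topic, which is the sparsity argument made in the abstract.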
