Acquiring Chinese Semantically Related Words from Wikipedia and Computing Their Relatedness

DOI: 10.13190/jbupt.200903.109.liy, PP. 109-112


Abstract:

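This paper presents a method for acquiring semantically related words from Wikipedia, the open encyclopedia, and for analyzing and computing their degree of semantic relatedness. We selected and cleaned more than 50,000 Chinese Wikipedia articles and, using features such as hyperlink relations and word co-occurrence, obtained nearly 400,000 word pairs that are closely related semantically at the level of concepts or facts. We also briefly analyze the clustering characteristics of these pairs. We then compute semantic relatedness by combining information such as the position and frequency of words within documents. Finally, drawing on results from classical algorithms, we run comparative experiments on sets with different degrees of semantic relatedness to assess the effectiveness of the proposed method for acquiring semantically related words.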
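The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea: treat each hyperlink in an article as a candidate related word for the article's title, then score the pair from how often and how early the linked word appears in the text. The function name `related_pairs`, the input layout, the equal 0.5/0.5 weighting, and the toy English article are all assumptions made for illustration, not the authors' method.

```python
def related_pairs(articles):
    """Score (title, link target) pairs from toy article data.

    articles: dict mapping title -> (text, list of link targets).
    Returns a dict mapping (title, target) -> score.
    Hypothetical scoring: an equal-weight mix of relative frequency
    and how early the target first appears in the text.
    """
    scores = {}
    for title, (text, links) in articles.items():
        if not text:
            continue
        # Occurrence count of each distinct link target in the article text.
        counts = {target: text.count(target) for target in set(links)}
        total = sum(counts.values())
        for target, n in counts.items():
            if n == 0:
                continue  # linked word never appears in the text; skip it
            freq = n / total                             # relative frequency
            earliness = 1.0 - text.find(target) / len(text)  # earlier is higher
            scores[(title, target)] = 0.5 * freq + 0.5 * earliness
    return scores


# Toy input (invented for this example).
articles = {
    "Beijing": (
        "Beijing is the capital of China. China is in Asia.",
        ["China", "Asia", "Mars"],
    ),
}
scores = related_pairs(articles)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

In this toy case "China" ranks above "Asia" because it occurs more often and earlier, and "Mars" is dropped because it never occurs in the text. A realistic pipeline would also need Chinese word segmentation and a parser for Wikipedia link markup, both of which are outside this sketch.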

