%0 Journal Article
%T The Clustering Analysis Technology for Information Retrieval
%T 信息检索中的聚类分析技术
%A Liu Yuan-chao
%A Wang Xiao-long
%A Liu Bing-quan
%A Zhong Bin-bin
%A 刘远超
%A 王晓龙
%A 刘秉权
%A 钟彬彬
%J 电子与信息学报
%D 2006
%X The rapid development of information retrieval (IR) and search engines has greatly improved recall, whereas gains in precision and retrieval efficiency are less clear. Research on document clustering and multi-document keyword extraction can help address this problem. The basic idea is to cluster a portion of the documents returned by a search engine and automatically extract keywords for each cluster, so that the user can judge whether the documents in each cluster are relevant to their needs. This paper proposes the concepts of document relevancy and cluster relevancy; both word frequency and the concept relevancy model of HOWNET are used to compute cluster relevancy, which in turn guides the cluster-merging process. Experimental results show that IR efficiency improves greatly.
%K Document clustering
%K Keyword extraction
%K HOWNET
%K Document relevancy
%K 文档聚类
%K 关键词抽取
%K 知网
%K 文档相关度
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=1319827C0C74AAE8D654BEA21B7F54D3&jid=EFC0377B03BD8D0EF4BBB548AC5F739A&aid=E2B9E46C48FAEBA2&yid=37904DC365DD7266&vid=D3E34374A0D77D7F&iid=E158A972A605785F&sid=6EEC242D0D8BE428&eid=5568599C60D4BE87&journal_id=1009-5896&journal_name=电子与信息学报&referenced_num=0&reference_num=10