%0 Journal Article
%T Similarity computing of documents based on VSM
(Research on text similarity computation based on VSM)
%A GUO Qing-lin
%A LI Yan-mei
%A TANG Qi
%A 郭庆琳
%A 李艳梅
%A 唐琦
%J Application Research of Computers (计算机应用研究)
%D 2008
%X The precision and efficiency of document similarity computation are the foundation of, and the key to, other document-processing tasks. This paper improves the DF and TF-IDF algorithms. The improved DF method has linear time complexity, which suits large-scale document processing, and it compensates for the weakness that rare but useful features may be deleted. The TF-IDF algorithm is also modified to improve the precision of document similarity.
%K document similarity
%K feature selection
%K TF-IDF (term frequency-inverse document frequency)
%K VSM (vector space model)
%K text similarity
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=A9D9BE08CDC44144BE8B5685705D3AED&aid=44908182727DCD15E3CA182E3ED2A46F&yid=67289AFF6305E306&vid=C5154311167311FE&iid=708DD6B15D2464E8&sid=B6B3C150E7B1D878&eid=078A215E74931DA7&journal_id=1001-3695&journal_name=计算机应用研究&referenced_num=9&reference_num=7