%0 Journal Article
%T New text categorization method based on the frequency of topic words
%T 基于主题词频数特征的文本主题划分
%A KANG Kai
%A LIN Kun-hui
%A ZHOU Chang-le
%A 康恺
%A 林坤辉
%A 周昌乐
%J 计算机应用
%D 2006
%X The word frequency matrix currently used in text categorization is characterized by high dimensionality and excessive sparsity, two features that make computation difficult. To solve this problem, a new text categorization method based on the frequency of topic words was proposed, according to search engine users' selections. The approach first filters new-concept topic words by a statistical method and then applies the Fuzzy C-Means (FCM) clustering algorithm to the documents, using the frequency of topic words rather than the frequency of single words as the feature. The method performed well in the experiment. It was also compared in several respects with a cluster-based text categorization method, and some useful conclusions about implementation and application were reached.
%K search engine
%K document clustering
%K Fuzzy C-Means (FCM)
%K topic word filtering
%K 搜索引擎
%K 文本聚类
%K 模糊C-均值
%K 主题词筛选
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=831E194C147C78FAAFCC50BC7ADD1732&aid=44F5A96C5E1E5EE6&yid=37904DC365DD7266&vid=96C778EE049EE47D&iid=5D311CA918CA9A03&sid=D418FDC97F7C2EBA&eid=BBCD5003575B2B5F&journal_id=1001-9081&journal_name=计算机应用&referenced_num=5&reference_num=7