%0 Journal Article
%T Quality Evaluation for Three Textual Document Clustering Algorithms
%T 文本聚类算法的质量评价
%A LIU Wu-Hua
%A LUO Tie-Jian
%A WANG Wen-Jie
%A 刘务华
%A 罗铁坚
%A 王文杰
%J 中国科学院研究生院学报
%D 2006
%X Textual document clustering is an effective approach to building a classification of a very large set of textual documents. Clustering validation, or quality evaluation, techniques can be used to assess the efficiency and effectiveness of a clustering algorithm. This paper presents quality evaluation criteria and, based on these criteria, assesses three typical textual document clustering algorithms experimentally. The comparison shows that the STC (Suffix Tree Clustering) algorithm performs better than the k-Means and Ant-Based clustering algorithms. STC performs better because it takes linguistic properties into account when processing the documents. The performance of the Ant-Based clustering algorithm varies with its input variables. Adopting linguistic properties is necessary to improve the performance of Ant-Based text clustering.
%K textual document clustering
%K quality evaluation
%K clustering validation
%K STC
%K Ant-Based clustering
%K k-Means clustering
%K 文本聚类
%K 质量评价
%K 有效性验证
%K 后缀树聚类
%K Ant-Based聚类
%K k-Means聚类
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=B5EDD921F3D863E289B22F36E70174A7007B5F5E43D63598017D41BB67247657&cid=B47B31F6349F979B&jid=67CDFDECD959936E166E0F72DE972847&aid=C2E3E117D4E580E5&yid=37904DC365DD7266&vid=EA389574707BDED3&iid=94C357A881DFC066&sid=DB7B2C790D19BE6E&eid=6C62BFE34266FA92&journal_id=1002-1175&journal_name=中国科学院研究生院学报&referenced_num=1&reference_num=16