%0 Journal Article
%T Two Spectral Algorithms for Ensembling Document Clusters (解决文本聚类集成问题的两个谱算法)
%A XU Sen (徐森)
%A LU Zhi-Mao (卢志茂)
%A GU Guo-Chang (顾国昌)
%+ College of Computer Science and Technology, Harbin Engineering University, Harbin; College of Information and Communication Engineering, Harbin
%J Acta Automatica Sinica (自动化学报)
%D 2009
%X A critical problem in cluster ensembles is how to combine multiple clusterings to yield a superior result. In this paper, the idea of spectral clustering is applied to the document cluster ensemble problem. Because spectral clustering must solve the eigenvalue decomposition of a large-scale matrix to obtain the low-dimensional embedding of documents used for later clustering, a fast spectral algorithm is first proposed, in which the large-scale eigenvalue decomposition problem is transformed into an equivalent singular value decomposition problem and then into the eigenvalue decomposition of a much smaller matrix. The characteristics of spectral clustering are then investigated further, and a second spectral algorithm is proposed, in which the low-dimensional embedding of documents is obtained indirectly from that of the hyperedges. Experiments on TREC and Reuters document sets show that both proposed spectral algorithms outperform other cluster ensemble techniques based on graph partitioning and can effectively solve the document cluster ensemble problem.
%K Clustering analysis
%K cluster ensemble
%K spectral clustering
%K document clustering
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=E76622685B64B2AA896A7F777B64EB3A&aid=418B9ED3C50621F46039D4BEC803CFA0&yid=DE12191FBD62783C&vid=6209D9E8050195F5&iid=DF92D298D3FF1E6E&sid=71EC92B56215521C&eid=15A3E3A739C4EF3F&journal_id=0254-4156&journal_name=自动化学报&referenced_num=5&reference_num=0