%0 Journal Article
%T Tri-Training and Data Editing Based Semi-Supervised Clustering Algorithm
%A DENG Chao
%A GUO Mao-Zu
%J Journal of Software (软件学报)
%D 2008
%X This paper proposes an algorithm named DE-Tri-training semi-supervised K-means, which obtains a larger seeds set containing less noise. Before the seeds set is used to initialize the cluster centroids, the training process of Tri-training, a semi-supervised classification approach, labels unlabeled data and adds them to the initial seeds set to enlarge it. To improve the quality of the enlarged seeds set, Depuration, a data editing technique based on the nearest neighbor rule, is introduced into the Tri-training process to remove or correct mislabeled noisy data in the enlarged seeds set. Experimental results show that the new semi-supervised clustering algorithm effectively improves cluster centroid initialization and enhances clustering performance.
%K semi-supervised clustering
%K semi-supervised classification
%K K-means
%K seeds set
%K Tri-training
%K depuration data editing
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=B9547C6A60D120A0941E4C8B2EB71AFD&yid=67289AFF6305E306&vid=2A8D03AD8076A2E3&iid=38B194292C032A66&sid=46FF101E7ECF9F15&eid=B28C697BC3A1BA62&journal_id=1000-9825&journal_name=软件学报&referenced_num=0&reference_num=23