


Novel semi-supervised clustering algorithm based on active data selection

Keywords: data mining, semi-supervised clustering, active learning, labeled data, data selection, minimum spanning tree, multi-density dataset, unbalanced dataset


Abstract:

Semi-supervised clustering, which aims to improve clustering results significantly using limited supervision, has become a research focus in data mining and machine learning in recent years. However, the accuracy of existing semi-supervised clustering algorithms is low on datasets with little labeled data and on multi-density and unbalanced datasets. Building on active learning, this paper studies data selection and presents a novel semi-supervised clustering algorithm. The algorithm selects information-rich data points to be labeled by combining the ideas of minimum spanning tree clustering and active learning, and then uses a KNN-like technique to propagate the labels. Evaluated on several UCI standard datasets and synthetic datasets, the proposed method shows markedly higher accuracy and more stable performance than the compared methods, even when the datasets are multi-density and unbalanced.
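
The abstract gives no algorithmic detail, so the following Python sketch is only one plausible reading of the pipeline it describes, not the authors' method. It assumes that (1) a Euclidean minimum spanning tree is cut at its longest edges to form candidate groups, (2) the medoid of each group is the "information-rich" point sent to a labeling oracle, and (3) labels spread by repeatedly giving the unlabeled point nearest to any labeled point that point's label. All function names, the medoid rule, the edge-cutting rule, and the toy data are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (dist, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    in_tree[0] = True
    # best[v] = (distance from v to the tree, tree vertex realizing it)
    best = [(math.dist(points[0], points[v]), 0) for v in range(n)]
    edges = []
    for _ in range(n - 1):
        j = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v][0])
        d, i = best[j]
        edges.append((d, i, j))
        in_tree[j] = True
        for v in range(n):
            if not in_tree[v]:
                dv = math.dist(points[j], points[v])
                if dv < best[v][0]:
                    best[v] = (dv, j)
    return edges

def components_after_cut(n, edges, n_groups):
    """Drop the n_groups - 1 longest MST edges; return the connected components."""
    kept = sorted(edges)[: len(edges) - (n_groups - 1)]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def select_queries(points, groups):
    """Assumed selection rule: query the medoid of each component."""
    return [
        min(g, key=lambda c: sum(math.dist(points[c], points[o]) for o in g))
        for g in groups
    ]

def propagate(points, seed_labels):
    """Assumed KNN-like rule: the unlabeled point closest to a labeled one copies its label."""
    labels = dict(seed_labels)
    unlabeled = set(range(len(points))) - set(labels)
    while unlabeled:
        _, u, l = min(
            (math.dist(points[u], points[l]), u, l) for u in unlabeled for l in labels
        )
        labels[u] = labels[l]
        unlabeled.remove(u)
    return [labels[i] for i in range(len(points))]

# Toy multi-density data: one tight group and one loose group (made up for illustration).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (6, 5), (5, 6), (6, 6), (5.5, 5.5)]
truth = ["a"] * 4 + ["b"] * 5          # stands in for the human labeling oracle

groups = components_after_cut(len(points), mst_edges(points), n_groups=2)
queries = select_queries(points, groups)
predicted = propagate(points, {q: truth[q] for q in queries})
```

In this sketch only two points are ever shown to the oracle, one per MST component, and every other label is inferred from its nearest labeled neighbor. Because the cut is made on the tree rather than on a global distance threshold, the tight and loose groups are separated even though their internal spacings differ.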
