%0 Journal Article
%T Three-Way Clustering Algorithm Based on Mean Shift
%A 吕军豪
%A 徐丁
%A 孙波
%A 杨晶
%J Pure Mathematics
%P 3401-3411
%@ 2160-7605
%D 2023
%I Hans Publishing
%R 10.12677/PM.2023.1312353
%X This paper combines mean-shift clustering with three-way decision theory. First, a kernel function is used to compute a weighted sum of the vectors from the center point to the sample points, which defines the offset (mean shift) vector; the center point is then moved repeatedly along this vector, following the density gradient, until it reaches the region of highest density. Next, the data are divided into non-noise points and noise points according to the access frequency of each sample point to the clusters. Non-noise points are assigned by traditional two-way clustering to form the core regions, while noise points are handled by three-way clustering: each is assigned to the boundary region of the corresponding cluster by comparing its access frequencies to the different clusters. The clustering result is thus represented by core regions and boundary regions. Experiments on UCI datasets show that, compared with traditional clustering algorithms, the proposed algorithm improves clustering accuracy, intra-cluster compactness, and inter-cluster separation.
%K Three-Way Clustering
%K Mean Shift
%K Offset Vector
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=77507