%0 Journal Article
%T Three-Way DBSCAN Text Clustering Based on Shadowed Sets and Shared Nearest Neighbor
%A 李志聪
%A 闫昆
%J Hans Journal of Data Mining
%P 137-150
%@ 2163-1468
%D 2025
%I Hans Publishing
%R 10.12677/hjdm.2025.152012
%X Traditional DBSCAN forces uncertain data points into a single cluster, which introduces decision risk. To address this, a three-way DBSCAN algorithm based on shadowed sets and shared nearest neighbors is proposed. Following the three-way decision approach, the algorithm assigns core points to the core region. For non-core points, it applies shadowed set theory to compute each sample's membership degree and assigns the sample to either the core region or the boundary region. A shared nearest neighbor algorithm then refines the assignment of samples in the boundary region, improving clustering accuracy and robustness. The algorithm is applied to text analysis, and comparative experiments show that it performs well and improves text clustering accuracy.
%K Three-Way Decision
%K Three-Way Clustering
%K Shadowed Sets
%K Text Clustering
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=110731