Journal of Fuzhou University (Natural Science Edition), 2016
Fuzzy C-Means Algorithm Based on Grey Relational Analysis
Abstract:
The standard fuzzy C-means (FCM) algorithm uses the Euclidean distance measure, which weights all features equally when computing the similarity between data points. As a result, it is sensitive to local features, identifies non-spherical clusters poorly, and adapts badly to high-dimensional data. To address these shortcomings, this paper proposes a new algorithm that integrates grey relational analysis, based on difference information theory, into FCM. The similarity between data points is described by the balanced closeness degree, which emphasizes judging similarity as a whole, attenuates the influence of high relational degrees on local features, and adapts to clusters of different shapes. Experiments on artificial and real-world datasets show that the proposed algorithm achieves higher clustering accuracy and better stability.
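The abstract does not give formulas, so the following is only a minimal sketch of how a balanced closeness degree might be computed. It assumes the usual textbook definitions from grey system theory: Deng's grey relational coefficient with distinguishing coefficient `rho`, the grey relational grade as the mean coefficient, and a balance factor equal to the normalized entropy of the coefficients. The function name `balanced_closeness`, its parameters, and the default `rho=0.5` are illustrative choices, not taken from the paper.

```python
import math


def balanced_closeness(reference, candidates, rho=0.5):
    """Score each candidate sequence against the reference (higher = more similar).

    Assumed definitions (not confirmed by the abstract):
      coefficient = (d_min + rho * d_max) / (d_k + rho * d_max)
      grade       = mean of the coefficients
      balance     = entropy of the normalized coefficients / log(n)
      score       = balance * grade
    """
    n = len(reference)
    # Absolute per-feature differences between the reference and each candidate.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate coincides with the reference.
        return [1.0] * len(candidates)

    scores = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grade = sum(coeffs) / n
        # Normalize the coefficients into a distribution and measure how evenly
        # the similarity is spread across the features.
        total = sum(coeffs)
        probs = [c / total for c in coeffs]
        entropy = -sum(p * math.log(p) for p in probs)
        balance = entropy / math.log(n) if n > 1 else 1.0
        scores.append(balance * grade)
    return scores


# A data point compared with two hypothetical cluster centers.
scores = balanced_closeness([0, 0, 0, 0], [[1, 1, 1, 1], [0, 0, 0, 4]])
print(scores)
```

Under these assumptions, the second candidate matches the reference exactly on three features and differs sharply on one, so its balance factor falls below 1 and its score is discounted relative to its plain grey relational grade. One plausible way to use such a score in FCM is to substitute a dissimilarity such as `1 - score` for the Euclidean distance in the membership update; the abstract does not state how the authors do this.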