%0 Journal Article
%T Research on K-Means Clustering Optimization Algorithm Based on Machine Learning
%A 李贞
%A 刘海燕
%A 刘策
%A 李庆钰
%A 刘刚
%J Hans Journal of Data Mining
%P 20-26
%@ 2163-1468
%D 2022
%I Hans Publishing
%R 10.12677/HJDM.2022.121003
%X K-Means is a representative partition-based clustering algorithm and a foundational method in machine learning. It automatically groups similar samples into the same cluster, and a sound choice of the value of K and of the K initial cluster centers improves the clustering result. With appropriate preprocessing, it supports preliminary analysis of the data and can even uncover implicit, valuable information. Compared with machine learning algorithms such as SVM and GBDT, it is simple to operate, uses the sum-of-squared-errors criterion function, and offers high scalability and compressibility on large datasets. However, the algorithm still suffers from instability caused by randomly chosen initial cluster centers, difficulty in selecting K, and very poor convergence on non-convex datasets. To improve the effectiveness of cluster analysis in data mining, this paper analyzes data mining, cluster analysis, and the traditional K-Means algorithm, and on that basis proposes an improved K-Means algorithm. Experiments show that the improved algorithm effectively raises cluster quality as well as the efficiency and stability of the algorithm, provides more accurate and effective service, and reduces algorithmic overhead.
%K Improved K-Means Algorithm
%K Mini Batch K-Means Algorithm
%K Data Mining
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=47753