Research on K-Means Clustering Optimization Algorithm Based on Machine Learning

DOI: 10.12677/HJDM.2022.121003, PP. 20-26

Keywords: Improved K-Means Algorithm, Mini Batch K-Means Algorithm, Data Mining



K-Means clustering is a typical partition-based clustering algorithm and a foundational algorithm in machine learning research. It automatically groups similar samples into the same category, and the clustering effect improves when the value of K and the K initial cluster centers are chosen reasonably. After proper pre-processing, the data can be analyzed and the information implicit in it can be mined. Compared with machine learning algorithms such as SVM and GBDT, K-Means is simple to operate, uses the sum-of-squared-errors criterion function, and is highly scalable and efficient on large data sets. However, the algorithm still has problems: randomly chosen initial cluster centers make it unstable, the value of K is hard to select well, and it has difficulty converging on non-convex data sets. To improve the effect of cluster analysis in data mining, this paper proposes an improved K-Means algorithm based on an analysis of data mining, cluster analysis, and the traditional K-Means algorithm. Experiments show that the improved K-Means algorithm effectively improves cluster quality as well as the efficiency and stability of the algorithm, provides more accurate and effective service, and reduces algorithm overhead.
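The instability attributed above to random initial cluster centers can be illustrated with a minimal sketch of the traditional K-Means procedure. This is not the improved algorithm proposed in the paper, whose details are not given in the abstract. The function names `kmeans` and `sse`, the toy `points`, and the restart-and-keep-the-best strategy are all assumptions made for illustration; restarting is only one common way to reduce sensitivity to initialization.

```python
import random

def sse(points, centers, labels):
    """Sum of squared errors: squared distance from each point to its assigned center."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(pt, centers[lab]))
        for pt, lab in zip(points, labels)
    )

def kmeans(points, k, seed, max_iter=100):
    """Traditional K-Means with randomly sampled initial centers (illustrative only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization: the source of instability
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels, sse(points, centers, labels)

# Hypothetical toy data: two loose groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

# Run with several seeds and keep the run with the lowest SSE.
runs = [kmeans(points, k=2, seed=s) for s in range(5)]
best_centers, best_labels, best_sse = min(runs, key=lambda r: r[2])
```

Each seed may converge to a different local optimum of the SSE criterion, which is why a single run of the traditional algorithm can be unreliable. Choosing K itself, and handling non-convex data sets, are separate problems that this sketch does not address.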



