Application of a Hybrid Model Based on the XGBoost Algorithm to Retail Data
Abstract:
This paper uses German retail sales data. After examining the characteristics of the data, we preprocess it and construct features. We then combine cluster analysis with XGBoost to build a hybrid prediction model for forecasting German retail sales. The model first reduces the dimensionality of the feature set and selects the optimal number of clusters, then trains a separate XGBoost model on each cluster obtained from the cluster analysis, and finally obtains the prediction as a weighted sum. The results show that, compared with other models, the hybrid model improves prediction accuracy and generalization ability.
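The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the abstract does not name the dimensionality-reduction method, the criterion for choosing the number of clusters, or the weighting scheme, so PCA, the silhouette score, and inverse-distance weights to the cluster centroids are stand-ins chosen here. The function names `fit_hybrid` and `predict_hybrid`, the hyperparameters, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

try:
    from xgboost import XGBRegressor as Booster
except ImportError:  # stand-in with a similar interface if xgboost is not installed
    from sklearn.ensemble import GradientBoostingRegressor as Booster


def fit_hybrid(X, y, k_range=range(2, 6), n_components=2, seed=0):
    """Reduce dimensions, pick k by silhouette score, train one booster per cluster."""
    pca = PCA(n_components=n_components, random_state=seed).fit(X)
    Z = pca.transform(X)

    # Assumption: the "optimal" cluster count is the one with the best silhouette score.
    best_score, km = -1.0, None
    for k in k_range:
        candidate = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(Z)
        score = silhouette_score(Z, candidate.labels_)
        if score > best_score:
            best_score, km = score, candidate

    # One gradient-boosted regressor per cluster, trained on that cluster's rows only.
    models = []
    for c in range(km.n_clusters):
        mask = km.labels_ == c
        model = Booster(n_estimators=50, max_depth=3, learning_rate=0.1)
        models.append(model.fit(X[mask], y[mask]))
    return pca, km, models


def predict_hybrid(pca, km, models, X, eps=1e-9):
    """Weighted sum of per-cluster predictions.

    Assumption: weights are inverse distances to each centroid, normalized per row.
    """
    dist = km.transform(pca.transform(X))           # shape (n_samples, k)
    weights = 1.0 / (dist + eps)
    weights /= weights.sum(axis=1, keepdims=True)   # each row sums to 1
    preds = np.column_stack([m.predict(X) for m in models])
    return (weights * preds).sum(axis=1)


# Synthetic stand-in data (not the German retail data set used in the paper).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(60, 4)) for c in (0.0, 5.0, 10.0)])
y = X.sum(axis=1) + rng.normal(scale=0.1, size=len(X))

pca, km, models = fit_hybrid(X, y)
y_hat = predict_hybrid(pca, km, models, X)
```

With inverse-distance weights, a sample close to one centroid is predicted almost entirely by that cluster's model, while samples near a cluster boundary blend neighbouring models. Other weighting schemes would fit the same structure by changing only `predict_hybrid`.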