Tourism Demand Forecasting Based on Internet Search Data in Beijing

DOI: 10.12677/HJDM.2022.122015, PP. 133-151

Keywords: Internet + Tourism, Internet Search Data, Adaptive Lasso, Support Vector Regression, Tourism Demand Forecasting


Abstract:

After 40 years of rapid growth, China's tourism industry has entered a new stage of high-quality development. As epidemic prevention and control becomes routine and the tourism market gradually recovers, the new "Internet + tourism" business model is expanding rapidly, and the large volume of Internet search data implicitly reflects people's tourism demand. This paper therefore uses Internet search data (IS) to forecast tourism demand in Beijing. First, Python is used to crawl travel notes and guides from online travel websites, the NLPIR word segmentation system is used to extract high-frequency words, and these are combined with the six elements of tourism to build an initial keyword lexicon. Second, the keywords are expanded using seven methods, including demand maps, Baidu Index related-word popularity recommendations, and recommendations from the Beijing tourism website. Adaptive Lasso and other methods then select the nine best predictor variables, and seasonal dummy variables are introduced. The search keywords are then combined with the random forest (RF), extreme gradient boosting (XGBoost), and support vector regression (SVR) algorithms to model and train forecasts of Beijing's tourism demand. Finally, based on multiple forecasting performance metrics, the SVR model is identified as the best model. The results show that Internet search data is significantly correlated with tourism demand and is highly timely, and that the SVR model copes well with unexpected events and small samples, making it an efficient and feasible approach to short-term tourism demand forecasting.
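The variable-selection step named in the abstract can be illustrated with a small sketch. The abstract does not say how Adaptive Lasso was implemented, so everything below is an assumption made for illustration: the function names (`soft_threshold`, `weighted_lasso`, `adaptive_lasso`), the penalty `lam`, the exponent `gamma`, and the synthetic data standing in for standardised search indices. The sketch uses a common two-stage form: an unpenalised first fit supplies per-feature weights, and a weighted Lasso solved by coordinate descent then drives weakly supported coefficients to exactly zero.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|.

    No intercept is fitted: the data are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Two-stage adaptive Lasso: initial fit -> weights -> weighted Lasso."""
    p = len(X[0])
    # Stage 1: unpenalised least squares (lam = 0) via the same solver.
    init = weighted_lasso(X, y, 0.0, [0.0] * p)
    # Stage 2: features with small initial coefficients get large penalties.
    weights = [1.0 / (abs(b) ** gamma + eps) for b in init]
    return weighted_lasso(X, y, lam, weights)


# Synthetic stand-in for standardised monthly search indices (NOT the paper's data).
random.seed(0)
n, p = 60, 6
true_beta = [3.0, 0.0, -2.0, 0.0, 0.0, 1.5]  # hypothetical: 3 informative keywords
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.3) for row in X]

beta = adaptive_lasso(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected keyword columns:", selected)
```

In a workflow like the one the abstract outlines, the columns kept by this step, together with seasonal dummy variables, would then be passed to the forecasting models. In practice `lam` would normally be chosen by cross-validation or an information criterion rather than fixed by hand.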
