The rapid development of the Internet has produced massive volumes of network data, and with them a range of problems in data storage, computation, and analysis. As business systems grow more complex and diverse, the requirements for timely data analysis keep rising. Many of the offline analysis methods commonly used in the past no longer meet today's production needs, and recommender systems in particular now face stronger real-time demands. Matrix factorization is currently one of the more popular approaches to recommendation, and it clearly outperforms other algorithms in both prediction accuracy and prediction precision. However, traditional matrix factorization methods suffer from slow computation and insufficient computing resources when processing large-scale data. Flink, a widely used stream processing framework for big data, has clear advantages in iterative computation and stream data processing. This paper combines matrix factorization with Flink: building on the original matrix factorization recommendation algorithm, it proposes an optimized matrix factorization model based on Flink that addresses the bottleneck matrix factorization faces in big data environments.