Data sparsity in collaborative filtering leads to low recommendation quality and low recommendation efficiency. To address this problem, this paper proposes a collaborative filtering recommendation algorithm based on improved K-means clustering and user attributes. To reduce the randomness of initial center selection in K-means, the Canopy algorithm is first used to coarsely cluster the data, and the "maximum-minimum distance product method" is introduced to select the initial points; K-means is then used for clustering. After the clusters are generated, the adjusted cosine similarity is combined with user attribute features to form a new similarity model, which is then used to make the recommendations. A comparison on the MAE and RMSE metrics shows that the improved algorithm increases both recommendation efficiency and recommendation accuracy.
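The abstract does not give the similarity formula, so the following is only a minimal sketch of how a rating-based similarity and a user-attribute similarity might be combined, together with the two evaluation metrics named above. Everything beyond the names MAE, RMSE, and adjusted cosine similarity is an assumption made for illustration: the mean-centred form of the cosine similarity, the attribute similarity (fraction of matching attribute values), the linear blend, the `weight` parameter, and all function names are hypothetical and are not taken from the paper.

```python
import math


def adjusted_cosine(ratings_u, ratings_v):
    """Mean-centred cosine similarity between two users (assumed form).

    Each argument maps item id -> rating. The numerator runs over co-rated
    items; each norm runs over all items rated by that user.
    """
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    norm_u = math.sqrt(sum((r - mean_u) ** 2 for r in ratings_u.values()))
    norm_v = math.sqrt(sum((r - mean_v) ** 2 for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with constant ratings carries no signal here
    return num / (norm_u * norm_v)


def attribute_similarity(attrs_u, attrs_v):
    """Fraction of shared attributes with identical values (hypothetical)."""
    keys = attrs_u.keys() & attrs_v.keys()
    if not keys:
        return 0.0
    return sum(attrs_u[k] == attrs_v[k] for k in keys) / len(keys)


def combined_similarity(ratings_u, ratings_v, attrs_u, attrs_v, weight=0.7):
    """Linear blend of the two similarities; `weight` is an assumed parameter."""
    rating_sim = adjusted_cosine(ratings_u, ratings_v)
    attr_sim = attribute_similarity(attrs_u, attrs_v)
    return weight * rating_sim + (1 - weight) * attr_sim


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Toy data, invented for illustration only.
ratings_a = {"i1": 5, "i2": 3, "i3": 1}
ratings_b = {"i1": 5, "i2": 3, "i3": 1}
attrs_a = {"gender": "F", "age_group": "18-24", "occupation": "student"}
attrs_b = {"gender": "F", "age_group": "25-34", "occupation": "student"}

similarity = combined_similarity(ratings_a, ratings_b, attrs_a, attrs_b)
error_mae = mae([3, 4], [4, 2])
error_rmse = rmse([3, 4], [4, 2])
```

In a clustered setting such as the one described, a similarity of this kind would presumably be evaluated only between users in the same cluster, which is where the efficiency gain would come from; lower MAE and RMSE indicate more accurate predicted ratings.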