Application Research of Computers (计算机应用研究), 2010
Approach of Web clustering based on hybrid particle swarm optimization model
Abstract:
This paper analyzes the status of Web mining in the e-commerce environment. Web data are massive and high-dimensional, so extracting implicit, previously unknown knowledge from them brings computational complexity and the curse of dimensionality. Building on the K-means clustering, particle swarm optimization (PSO) clustering, and hybrid PSO clustering algorithms, the paper presents a combined model based on principal component analysis (PCA) and hybrid PSO for clustering Web server log files. The interrelated Web data are first processed by PCA, and the PCA results serve as the input data for the hybrid PSO clustering algorithm. This not only reduces the number of input variables and the scale of the clustering problem, but also preserves the main information of the original variables and eliminates multicollinearity among them. The result is an effective clustering model for Web data that are massive, high-dimensional, and heterogeneous.
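
The pipeline described in the abstract (PCA first, then hybrid PSO clustering on the reduced data) might be sketched as follows. This is a minimal illustration, not the paper's implementation. The abstract does not say how the hybrid is built, so the sketch assumes one common scheme in which one particle of the swarm is seeded with K-means centroids. The fitness function (mean squared distance to the nearest centroid), the PSO coefficients, the function names, and the synthetic "session" data are all assumptions made for the example.

```python
import random

def pca(rows, k, iters=100, seed=0):
    """Project rows onto their top-k principal components (power iteration + deflation)."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    X = [[v - m for v, m in zip(r, mean)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in X) / (n - 1) for b in range(d)] for a in range(d)]
    comps = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        for _ in range(iters):
            w = [sum(c * t for c, t in zip(row, v)) for row in cov]
            norm = sum(t * t for t in w) ** 0.5 or 1.0
            v = [t / norm for t in w]
        lam = sum(vi * sum(c * t for c, t in zip(row, v)) for vi, row in zip(v, cov))
        comps.append(v)
        # Deflate: remove the component just found from the covariance matrix.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return [[sum(a * b for a, b in zip(x, c)) for c in comps] for x in X]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fitness(points, centroids):
    """Assumed fitness: mean squared distance from each point to its nearest centroid."""
    return sum(min(dist2(p, c) for c in centroids) for p in points) / len(points)

def assign(points, centroids):
    return [min(range(len(centroids)), key=lambda j: dist2(p, centroids[j])) for p in points]

def kmeans(points, k, rng, iters=10):
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def hybrid_pso(points, k, particles=10, iters=50, w=0.72, c1=1.49, c2=1.49, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle encodes a full set of k centroids.
    swarm = [[list(p) for p in rng.sample(points, k)] for _ in range(particles)]
    swarm[0] = kmeans(points, k, rng)  # assumed hybrid step: seed one particle with K-means
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(particles)]
    pbest = [[c[:] for c in s] for s in swarm]
    pbest_fit = [fitness(points, s) for s in swarm]
    g = min(range(particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(particles):
            for j in range(k):
                for t in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][t] = (w * vel[i][j][t]
                                    + c1 * r1 * (pbest[i][j][t] - swarm[i][j][t])
                                    + c2 * r2 * (gbest[j][t] - swarm[i][j][t]))
                    swarm[i][j][t] += vel[i][j][t]
            fit = fitness(points, swarm[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in swarm[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in swarm[i]], fit
    return gbest, assign(points, gbest)

# Synthetic stand-in for per-session log features; two well-separated groups of 20 sessions.
data_rng = random.Random(1)
sessions = ([[data_rng.gauss(0, 0.5) for _ in range(4)] for _ in range(20)]
            + [[data_rng.gauss(10, 0.5) for _ in range(4)] for _ in range(20)])
reduced = pca(sessions, k=2)                  # 4 correlated features -> 2 components
centroids, labels = hybrid_pso(reduced, k=2)  # cluster in the reduced space
```

Because the K-means result enters the swarm as an initial personal best, the global best returned by the PSO loop can never have a worse fitness than the K-means seed. Running PCA first shrinks each particle from k × (original dimension) to k × (number of retained components) coordinates, which is the reduction in problem size the abstract refers to.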