计算机科学 (Computer Science), 2007
The PP Principal Component Based on Kernel and its Application in Clustering with Outliers
Abstract:
Dimension reduction is the main way to improve the efficiency of outlier mining on high-dimensional data sets. After analyzing the advantages and disadvantages of classical outlier mining algorithms, this paper proposes a novel algorithm for clustering with outliers that combines the kernel method with PP principal components. A kernel-based PP principal component transformation is introduced to reduce the data dimension. Through the transformation matrix, we obtain a nonlinear reduction of the data dimension and attach an additional weighting factor to each vector. The iterative update functions derived from the fuzzy clustering objective function are modified accordingly, so that the final weight of a datum represents how representative that datum is. With these weights, experts can easily identify the outliers. Theoretical analysis indicates that the algorithm converges, and simulation results show that it is efficient.
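As a rough illustration of the kind of pipeline the abstract describes (kernel-based dimension reduction followed by fuzzy clustering with one weight per datum), the Python sketch below may help. It is not the paper's algorithm. Standard RBF kernel PCA stands in for the kernel-based PP principal component transformation, and the weighted objective J = sum_k w_k^(-q) sum_i u_ik^m d_ik^2 with sum_k w_k = n is an assumed form, since the abstract states neither the transformation nor the objective function. The names rbf_kernel_pca and weighted_fcm and the parameters gamma, m, and q are hypothetical.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Standard RBF kernel PCA (a stand-in, NOT the paper's kernel PP transform)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one    # centre the data in feature space
    vals, vecs = np.linalg.eigh(Kc)               # eigenvalues in ascending order
    vals = vals[::-1][:n_components]
    vecs = vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0))  # coordinates of the training points

def weighted_fcm(Z, n_clusters=2, m=2.0, q=2.0, n_iter=100, seed=0, eps=1e-12):
    """Fuzzy c-means with one weight per datum, under the ASSUMED objective
    J = sum_k w_k^(-q) * sum_i u_ik^m * d_ik^2, subject to sum_k w_k = n."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    U = rng.random((n_clusters, n))
    U /= U.sum(axis=0, keepdims=True)             # memberships of each datum sum to 1
    w = np.ones(n)
    for _ in range(n_iter):
        coef = (U ** m) * (w ** -q)               # influence of datum k on cluster i
        V = coef @ Z / coef.sum(axis=1, keepdims=True)
        d2 = ((Z[None, :, :] - V[:, None, :]) ** 2).sum(axis=2) + eps
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)  # usual FCM membership update
        D = (U ** m * d2).sum(axis=0)             # clustering cost of each datum
        w = D ** (1.0 / (q + 1.0))                # stationary point of J in w_k
        w *= n / w.sum()                          # enforce sum_k w_k = n
    return U, V, w

# Toy data (made up for illustration): two tight groups and one distant point.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([0.0, 0.0], 0.1, size=(20, 2)),
    rng.normal([3.0, 0.0], 0.1, size=(20, 2)),
    [[10.0, 10.0]],                               # the intended outlier, last row
])
Z = rbf_kernel_pca(X, n_components=2, gamma=0.5)
U, V, w = weighted_fcm(Z, n_clusters=2)
rep = w ** -2.0                                   # w^(-q) with the default q = 2
print("least representative datum:", int(np.argmin(rep)))
```

In this sketch a datum that fits no cluster well receives a large w_k and therefore a small influence w_k^(-q) on the cluster centres, so w_k^(-q) can be read as a representativeness score and its smallest values flag candidate outliers. Whether the paper's weights follow the same convention cannot be determined from the abstract.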