Application Research of Computers (计算机应用研究), 2011
Global outlier detection based on hierarchical clustering
Abstract:
Existing outlier detection algorithms leave room for improvement in versatility, effectiveness, user-friendliness, and performance on high-dimensional and large databases. This paper proposes a fast and effective global outlier detection approach based on hierarchical clustering. Agglomerative hierarchical clustering is performed first; the degree of isolation of the data can then be judged visually, and the number of outliers determined, from the clustering tree and the distance matrix. The outliers are then identified in an unsupervised manner by traversing the clustering tree from the top down. Experimental results show that this approach identifies global outliers quickly and effectively, is user-friendly, and handles datasets of various shapes. The experiments also show that the approach is suitable for high-dimensional and large databases.
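
The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the linkage criterion, the stopping rule, or any data structures, so everything below is an assumption made for illustration: single linkage with Euclidean distance, a naive cubic-time merge loop, and a top-down walk that peels off branches of at most `max_group` points until `n_outliers` points have been collected. The names `single_linkage_tree`, `leaves`, `top_down_outliers`, and `max_group` are hypothetical and do not come from the paper.

```python
import math


def single_linkage_tree(points):
    """Naive agglomerative clustering (single linkage, Euclidean distance).

    Assumed representation: a leaf is an int index into `points`;
    an internal node is a tuple (left, right, merge_distance).
    """
    # Each active cluster is (tree_node, list_of_member_indices).
    clusters = [(i, [i]) for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x][1], clusters[y][1])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merged = ((clusters[x][0], clusters[y][0], d),
                  clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]


def leaves(node):
    """Return the point indices under a tree node."""
    if isinstance(node, int):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def top_down_outliers(root, n_outliers, max_group=1):
    """Walk down from the root, peeling small branches off as outliers.

    `n_outliers` plays the role of the outlier count that, per the
    abstract, is determined from the tree and the distance matrix.
    """
    outliers = []
    node = root
    while not isinstance(node, int) and len(outliers) < n_outliers:
        left, right = leaves(node[0]), leaves(node[1])
        if len(left) <= len(right):
            small, rest = left, node[1]
        else:
            small, rest = right, node[0]
        if len(small) > max_group:
            break  # both branches look like genuine clusters; stop peeling
        outliers.extend(small)
        node = rest
    return sorted(outliers)


# Five points near the unit square plus two isolated points (indices 4 and 6).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (0.5, 0.5), (-8, 7)]
root = single_linkage_tree(points)
print(top_down_outliers(root, n_outliers=2))
```

In this sketch the isolated points are the last to be merged, so they sit directly under the root and are found first by the top-down walk. The outlier count must be supplied by the caller; asking for more outliers than the data contains would start peeling points off a genuine cluster.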