Computer Science (计算机科学), 2007
CIS: An Iterative Spread-based Algorithm for Clustering Micro-array Data
Abstract:
DNA micro-array technology makes it possible to monitor the expression levels of tens of thousands of genes simultaneously. Traditional clustering methods suffer from the curse of dimensionality when applied directly to micro-array data. The two common dimensionality reduction approaches, feature transformation and feature selection, are ill-suited to micro-array analysis: the former generates new features that are difficult to interpret, and the latter discards some information. Moreover, most traditional clustering algorithms require user-specified parameters, and different parameter choices may lead to quite different results. In this paper, we present CIS, an iterative spread-based algorithm for clustering micro-array data that selects its threshold automatically. Instead of relying on feature selection or feature transformation, CIS works in a progressively refining manner: it repeatedly partitions the genes using the newly generated sample clusters as features, and then partitions the samples using the newly generated gene clusters as features. The algorithm is applied to two real gene micro-array data sets. Experimental results confirm its effectiveness and efficiency.
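
The alternating scheme described above can be illustrated with a minimal sketch. This is not the CIS algorithm itself: the abstract does not specify the spread measure or the automatic threshold selection, so the sketch substitutes a plain k-means step with fixed, user-chosen cluster counts. All names here (`alternate_cluster`, `aggregate_columns`, `k_genes`, `k_samples`) are hypothetical, and the expression matrix is assumed to be a list of gene rows with one column per sample.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def sqdist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Matrix, k: int, iters: int = 50) -> List[int]:
    """Lloyd's k-means with deterministic farthest-first initialisation.

    Stand-in for the paper's spread-based partitioning step, which the
    abstract does not describe in detail.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(nxt))
    labels: Optional[List[int]] = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels if labels is not None else [0] * len(points)


def aggregate_columns(matrix: Matrix, col_labels: List[int], k: int) -> Matrix:
    """Replace the columns by k features: the row mean over each column cluster."""
    out = []
    for row in matrix:
        sums, counts = [0.0] * k, [0] * k
        for value, label in zip(row, col_labels):
            sums[label] += value
            counts[label] += 1
        out.append([s / c if c else 0.0 for s, c in zip(sums, counts)])
    return out


def alternate_cluster(
    expr: Matrix, k_genes: int, k_samples: int, rounds: int = 10
) -> Tuple[List[int], List[int]]:
    """Alternately cluster genes (rows) and samples (columns) of expr."""
    n_samples = len(expr[0])
    by_sample = [list(col) for col in zip(*expr)]  # samples x genes
    sample_labels = list(range(n_samples))  # start: each sample is its own cluster
    n_sample_clusters = n_samples
    gene_labels: Optional[List[int]] = None
    for _ in range(rounds):
        # Partition genes, using the current sample clusters as features.
        gene_feats = aggregate_columns(expr, sample_labels, n_sample_clusters)
        new_gene_labels = kmeans(gene_feats, k_genes)
        # Partition samples, using the new gene clusters as features.
        sample_feats = aggregate_columns(by_sample, new_gene_labels, k_genes)
        new_sample_labels = kmeans(sample_feats, k_samples)
        if new_gene_labels == gene_labels and new_sample_labels == sample_labels:
            break  # neither partition changed in this round
        gene_labels, sample_labels = new_gene_labels, new_sample_labels
        n_sample_clusters = k_samples
    return gene_labels, sample_labels
```

In this sketch each round shrinks the feature space to the number of clusters on the other axis, which is how the alternating scheme sidesteps the high dimensionality of the raw matrix while keeping features that remain interpretable as groups of samples or genes.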