Acta Automatica Sinica (自动化学报), 2008
State-of-the-art of Cluster Analysis of Gene Expression Data
Abstract:
The flood of gene expression data produced by DNA microarray technology has driven the development of automated analysis techniques and tools. Cluster analysis is an effective and practical way to mine this large volume of data for important genetic and biological information. Many improved versions of conventional clustering algorithms, as well as new clustering algorithms, have recently been proposed for processing gene expression data. This survey first introduces how gene expression data are produced and represented, and then discusses the state-of-the-art clustering algorithms applied to them. According to the goal of clustering, the algorithms are divided into three categories: gene-based clustering, sample-based clustering, and biclustering. The basic biological principles and challenges of each category are presented. For each category, the basic principle is discussed in detail, together with its advantages and drawbacks. The paper concludes with a summary of the field and a discussion of future trends.
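
As a rough illustration of the first two categories named in the abstract, the sketch below clusters the rows and then the columns of a small expression matrix. It is not taken from the paper: the genes-by-samples layout, the made-up values, the helper name `agglomerate`, and the choice of single-linkage agglomerative clustering with Euclidean distance are all assumptions made for this example. Biclustering, which groups genes and samples simultaneously, is not shown.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerate(vectors, k):
    """Hypothetical helper: single-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters until k remain and
    returns the clusters as sorted lists of row indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


# Toy expression matrix (made-up values): rows are genes, columns are samples.
expression = [
    [1.0, 1.1, 5.0, 5.2],  # gene 0
    [0.9, 1.0, 5.1, 5.0],  # gene 1
    [5.0, 5.1, 1.0, 0.9],  # gene 2
    [5.2, 4.9, 1.1, 1.0],  # gene 3
]

# Gene-based clustering: each gene is a vector of its levels across samples.
gene_clusters = agglomerate(expression, k=2)

# Sample-based clustering: transpose so each sample is a vector across genes.
samples = [list(column) for column in zip(*expression)]
sample_clusters = agglomerate(samples, k=2)

print("gene clusters:  ", gene_clusters)
print("sample clusters:", sample_clusters)
```

The same routine serves both views because, under the assumed matrix layout, the only difference between gene-based and sample-based clustering is which axis supplies the objects being grouped.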