Computer Science (计算机科学), 2006
Gene Expression Data Feature Selection Based on GA and Clustering
Abstract:
Feature selection is an important problem in pattern recognition and data mining. For high-dimensional data such as gene expression data, feature selection can not only improve the accuracy and efficiency of classification and clustering, but also reveal informative feature subsets, such as genes closely associated with particular diseases. This paper proposes a new feature selection method for gene expression data: a genetic algorithm searches the space of feature subsets, and each candidate subset is evaluated by running a clustering algorithm on it and measuring the resulting error rate. Experiments show that the proposed algorithm finds feature subsets with good separability, which in turn yield good clustering and classification accuracy.
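
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a binary mask encoding of feature subsets, plain k-means as the clustering step, a majority-label mapping to turn clusters into an error rate, and tournament selection with uniform crossover and bit-flip mutation. All function names (`kmeans_labels`, `clustering_error`, `ga_select`) and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means (assumed clustering step); returns one label per sample."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def clustering_error(X, y, mask, k, rng):
    """Cluster on the selected features, map each cluster to its majority
    class, and return the fraction of samples that disagree with that class."""
    if not mask.any():
        return 1.0  # an empty subset is treated as the worst case
    labels = kmeans_labels(X[:, mask], k, rng)
    wrong = 0
    for j in range(k):
        members = y[labels == j]
        if members.size:
            wrong += members.size - np.bincount(members).max()
    return wrong / len(y)

def ga_select(X, y, k=2, pop_size=20, n_gen=30, p_mut=0.05, seed=0):
    """Genetic search over boolean feature masks; lower clustering error is fitter."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5
    best_mask, best_err = None, np.inf
    for _ in range(n_gen):
        errs = np.array([clustering_error(X, y, m, k, rng) for m in pop])
        i = errs.argmin()
        if errs[i] < best_err:
            best_mask, best_err = pop[i].copy(), errs[i]
        # Binary tournament selection: the lower-error individual of each pair wins.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((errs[a] <= errs[b])[:, None], pop[a], pop[b])
        # Uniform crossover between each parent and a randomly assigned mate.
        mates = parents[rng.permutation(pop_size)]
        cross = rng.random((pop_size, n_feat)) < 0.5
        children = np.where(cross, parents, mates)
        # Bit-flip mutation, then elitism: keep the best mask seen so far.
        children ^= rng.random((pop_size, n_feat)) < p_mut
        children[0] = best_mask
        pop = children
    return best_mask, best_err
```

Here `X` is a samples-by-genes matrix and `y` holds non-negative integer class labels; `ga_select(X, y)` returns the best mask found and its observed error rate. Because k-means is randomly initialized, the error of a given mask can vary between evaluations, and a practical fitness function might also penalize large subsets, which this sketch does not do.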