|
ANMM4CBR: a case-based reasoning method for gene expression data classification

Abstract: To obtain a robust classifier, this article proposes a novel method, Additive Nonparametric Margin Maximum for Case-Based Reasoning (ANMM4CBR). ANMM4CBR employs case-based reasoning (CBR) for classification. CBR is a suitable paradigm for microarray analysis, where the rules that define the domain knowledge are difficult to obtain because usually only a small number of training samples are available. Moreover, to select the most informative genes, we propose to perform feature selection by additively optimizing a nonparametric margin maximum criterion, which is defined based on gene pre-selection and sample clustering. Our feature selection method is very robust to noise in the data.

The effectiveness of our method is demonstrated on both simulated and real data sets. We show that ANMM4CBR performs better than some state-of-the-art methods such as support vector machine (SVM) and k nearest neighbor (kNN), especially when the data contain a high level of noise.

The source code is attached as an additional file of this paper.

Recently, gene microarray technology has become a fundamental tool in biomedical research, enabling us to simultaneously observe the expression of thousands of genes at the transcriptional level. Two typical problems that researchers want to solve using microarray data are: (1) discovering informative genes for classification based on different cell types or diseases [1]; (2) clustering and arranging genes according to their similarity in expression patterns [2]. Here we focus on the former, especially on microarray classification using gene expression data, which has attracted extensive attention in the last few years.
It is believed that gene expression profiling could be a precise and systematic approach for cancer diagnosis and clinical-outcome prediction [3]. After about ten years of research, many algorithms have been applied to microarray classification, such as nearest neighbor (NN) [4], artificial n
|