全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

A comparison of univariate and multivariate gene selection techniques for classification of cancer datasets

DOI: 10.1186/1471-2105-7-235

Full-Text   Cite this paper   Add to My Lib

Abstract:

In this study we adopted an unbiased protocol to perform a fair comparison of frequently used multivariate and univariate gene selection techniques, in combination with a r?nge of classifiers. Our conclusions are based on seven gene expression datasets, across several cancer types.Our experiments illustrate that, contrary to several previous studies, in five of the seven datasets univariate selection approaches yield consistently better results than multivariate approaches. The simplest multivariate selection approach, the Top Scoring method, achieves the best results on the remaining two datasets. We conclude that the correlation structures, if present, are difficult to extract due to the small number of samples, and that consequently, overly-complex gene selection algorithms that attempt to extract these structures are prone to overtraining.Gene expression microarrays enable the measurement of the activity levels of thousands of genes on a single glass slide. The number of genes (features) is in the order of thousands while the number of arrays is usually limited to several hundreds, due to the high cost associated with the procedure and the sample availability. In classification tasks a reduction of the feature space is usually performed [1,2]. On the one hand it decreases the complexity of the classification task and thus improves the classification Performance [3-7]. This is especially true when the classifiers employed are sensitive to noise. On the other hand it identifies relevant genes that can be potential biomarkers for the problem under study, and can be used in the clinic or for further studies, e.g. as targets for new types of therapies.A widely used search strategy employs a criterion to evaluate the informativeness of each gene individually. We refer to this approach as univariate gene selection. Several criteria have been proposed in the literature, e.g. Golub et al. [8] introduced the signal-to-noise-ratio (SNR), also employed in [9,10]. Bendor et

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133