Empirical study of supervised gene screening

DOI: 10.1186/1471-2105-7-537

Abstract:

We investigate the concordance and reproducibility of supervised gene screening based on eight commonly used marginal statistics. Concordance is assessed by the relative fraction of overlap between the top-ranked genes screened using different marginal statistics. We propose a Bootstrap Reproducibility Index, which measures the reproducibility of individual genes under supervised screening. Empirical studies are based on four public microarray data sets. We consider the cases where the top 20%, 40% and 60% of genes are screened.

From a gene discovery point of view, the effect of supervised gene screening based on different marginal statistics cannot be ignored. Empirical studies show that (1) the genes that pass different supervised screenings may be considerably different; (2) concordance may vary, depending on the underlying data structure and the percentage of selected genes; (3) evaluated with the Bootstrap Reproducibility Index, genes that pass supervised screening are only moderately reproducible; and (4) concordance cannot be improved by supervised screening based on reproducibility.

Microarray techniques provide a way of monitoring gene expression on a large scale. Biomedical experiments have been designed to discover important genes or gene pathways that are linked with variations in phenotype. Those genes can then be used as biomarkers in clinical studies and to construct predictive models in downstream analysis. Examples of such studies include disease classification studies in [1-3] and survival analysis in [4,5], among many others.

Statistical analyses using gene expressions as covariates are very challenging because of the high dimensionality of gene expression measurements and the small sample sizes. Consider for example the Leukemia data [6], which is used as an example of binary classification in [7]. The data contain expression measurements of 6817 genes from 72 samples. We refer to [6] for the experimental setup. A typical analysis, as presented in [7], consists of the following three ste
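The two quantities named in the abstract can be illustrated with a minimal sketch for a two-class data set. Everything in it is an assumption made for illustration: the two marginal statistics (a Welch-type t statistic and an absolute mean difference) stand in for the paper's eight, and the bootstrap selection frequency is only a plausible proxy for the Bootstrap Reproducibility Index, whose exact definition is not given in the abstract. All function names are hypothetical.

```python
import numpy as np

def t_like_statistic(X, y):
    """Absolute Welch-type t statistic per gene (columns of X), classes coded 0/1."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    # Guard against a zero standard error in a degenerate resample.
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / np.maximum(se, 1e-12)

def mean_difference(X, y):
    """Absolute difference of class means per gene."""
    return np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))

def top_genes(scores, fraction):
    """Indices of the top `fraction` of genes, ranked by descending score."""
    k = max(1, int(round(fraction * scores.size)))
    return set(np.argsort(scores)[::-1][:k].tolist())

def concordance(scores_a, scores_b, fraction):
    """Fraction of genes shared by the two top-ranked sets of equal size."""
    a = top_genes(scores_a, fraction)
    b = top_genes(scores_b, fraction)
    return len(a & b) / len(a)

def bootstrap_selection_frequency(X, y, statistic, fraction, n_boot=200, seed=0):
    """Per-gene share of bootstrap resamples in which the gene is screened in.

    Assumption: used here as a stand-in for a reproducibility index.
    Samples are resampled with replacement within each class.
    """
    rng = np.random.default_rng(seed)
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    counts = np.zeros(X.shape[1])
    for _ in range(n_boot):
        rows = np.concatenate([rng.choice(idx0, idx0.size),
                               rng.choice(idx1, idx1.size)])
        selected = top_genes(statistic(X[rows], y[rows]), fraction)
        counts[list(selected)] += 1
    return counts / n_boot
```

Given an expression matrix `X` (samples by genes) and labels `y`, `concordance(t_like_statistic(X, y), mean_difference(X, y), 0.2)` gives the overlap fraction at the 20% screening level, and `bootstrap_selection_frequency(X, y, t_like_statistic, 0.2)` returns one value in [0, 1] per gene, with values near 1 indicating genes that are screened in under almost every resample.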
