PLOS ONE 2013

Assessing Differential Expression in Two-Color Microarrays: A Resampling-Based Empirical Bayes Approach

DOI: 10.1371/journal.pone.0080099

Dongmei Li, Marc A. Le Pape, Nisha I. Parikh, Will X. Chen, Timothy D. Dye

Abstract:

Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim to control both Type I and Type II error rates; however, real microarray data do not always satisfy their distributional assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, its fold change criteria are problematic and can critically alter the conclusion of a study, as a result of compositional changes in the control data set used in the analysis. We propose a novel approach that combines resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but is also unaffected by the fold change threshold, since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control among the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates than Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large, for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next-generation sequencing RNA-seq data analysis.
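
The general idea of pairing a shrunken-variance statistic with resampling, and of scoring a method by sensitivity, specificity, total rejections, and false discovery rate, can be pictured with a minimal sketch. This is not the authors' algorithm: it assumes one log-ratio per array for each gene, uses sign-flipping as the resampling scheme, and replaces the estimated empirical Bayes prior with a crude fixed one (the prior degrees of freedom `d0` and the median-variance prior `s0_sq` are illustrative assumptions). All function names are hypothetical.

```python
import random
import statistics


def moderated_t(values, s0_sq, d0):
    """One-sample t-statistic whose variance is shrunk toward a prior s0_sq."""
    n = len(values)
    d = n - 1
    s_sq = statistics.variance(values)
    shrunk = (d0 * s0_sq + d * s_sq) / (d0 + d)
    return statistics.mean(values) / (shrunk / n) ** 0.5


def sign_flip_pvalues(data, d0=4.0, n_resamples=200, seed=0):
    """Resampling p-values per gene; data is a list of per-gene log-ratio lists."""
    rng = random.Random(seed)
    # Assumption: prior variance = median gene variance, kept fixed in resamples.
    s0_sq = statistics.median([statistics.variance(g) for g in data])
    observed = [abs(moderated_t(g, s0_sq, d0)) for g in data]
    exceed = [0] * len(data)
    n_arrays = len(data[0])
    for _ in range(n_resamples):
        # One sign per array, shared by all genes in this resample.
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n_arrays)]
        for i, gene in enumerate(data):
            flipped = [s * x for s, x in zip(signs, gene)]
            if abs(moderated_t(flipped, s0_sq, d0)) >= observed[i]:
                exceed[i] += 1
    # Add-one correction keeps every p-value strictly positive.
    return [(c + 1) / (n_resamples + 1) for c in exceed]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def performance(rejected, truly_de):
    """Simulation metrics from rejection decisions and the known truth."""
    tp = sum(1 for r, t in zip(rejected, truly_de) if r and t)
    fp = sum(1 for r, t in zip(rejected, truly_de) if r and not t)
    fn = sum(1 for r, t in zip(rejected, truly_de) if not r and t)
    tn = sum(1 for r, t in zip(rejected, truly_de) if not r and not t)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "total_rejections": tp + fp,
        "fdr": fp / (tp + fp) if tp + fp else 0.0,
    }
```

In a simulation, genes whose adjusted p-value falls below a chosen level would be marked as rejected and passed to `performance` together with the known truth. The smallest attainable p-value is 1/(n_resamples + 1), so the number of resamples limits how many genes can survive adjustment.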

References

[1]	Adkins RM, Krushkal J, Tylavsky FA, Thomas F (2011) Racial differences in gene-specific DNA methylation levels are present at birth. Birth Defects Res A Clin Mol Teratol 91: 728–36.
[2]	Smyth GK (2004) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3: Article 3.
[3]	Dudoit S, Yang YH, Callow MJ, Speed TP (2002) Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12: 111–139.
[4]	Tusher VG, Tibshirani R, Chu G (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences 98: 5116–5121.
[5]	Efron B, Tibshirani RJ, Storey JD, Tusher V (2001) Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association 96: 1151–1160.
[6]	Lönnstedt I, Speed TP (2002) Replicated microarray data. Statistica Sinica 12: 31–46.
[7]	Larsson O, Wahlestedt C, Timmons JA (2005) Considerations when using the significance analysis of microarrays (SAM) algorithm. BMC Bioinformatics 6: 129–134.
[8]	Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 30: 207–210.
[9]	Soric B (1989) Statistical discoveries and effect-size estimation. Journal of the American Statistical Association 84: 608–610.
[10]	Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological) 57: 289–300.
[11]	Good PI (2005) Permutation, parametric and bootstrap tests of hypotheses. Springer, 3rd edition.
[12]	Westfall PH, Young SS (1993) Resampling-based multiple testing: examples and methods for P-Value adjustment. New York: Wiley.
[13]	Calian V, Li D, Hsu JC (2008) Partitioning to uncover conditions for permutation tests to control multiple testing error rates. Biometrical Journal 50: 756–766.
[14]	Efron B (1979) Bootstrap methods: Another look at the jackknife. The Annals of Statistics 7: 1–26.
[15]	Efron B, Tibshirani RJ (1994) An introduction to the bootstrap. Chapman & Hall/CRC.
[16]	Freedman DA (1981) Bootstrapping regression models. The Annals of Statistics 9: 1218–1228.
[17]	Hall P (1986) On the bootstrap and confidence intervals. The Annals of Statistics 14: 1431–1452.
[18]	Pollard KS, van der Laan MJ (2005) Resampling-based multiple testing: Asymptotic control of type I error and applications to gene expression data. Journal of Statistical Planning and Inference 125: 85–100.
[19]	Kerr MK, Churchill GA (2001) Experimental design for gene expression microarrays. Biostatistics 2: 183–201.
[20]	Menon R, Conneely KN, Smith AK (2012) DNA methylation: an epigenetic risk factor in preterm birth. Reproductive Sciences 19: 6–13.
[21]	Parets SE, Conneely KN, Kilaru V, Fortunato SJ, Syed TA, et al. (2013) Fetal DNA methylation associates with early spontaneous preterm birth and gestational age. PLoS One 8: e67489.
[22]	Jensen TG, Soi S, Wang L (2009) A Bayesian approach to efficient differential allocation for resampling-based significance testing. BMC Bioinformatics 10: 198.
