Quasi-Negative Binomial: Properties, Parametric Estimation, Regression Model and Application to RNA-SEQ Data

DOI: 10.4236/ojs.2022.122016, PP. 216-237

Keywords: Queuing Models, Overdispersion, Moment Estimators, Delta Method, Bootstrap, Maximum Likelihood Estimation, Fisher’s Information, Orthogonal Polynomials, Regression Models, RNA-Seq Data

Abstract:

Background: The Poisson and the Negative Binomial distributions are commonly used to model count data. The Poisson is characterized by the equality of mean and variance, whereas the Negative Binomial has a variance larger than the mean and is therefore appropriate for modeling over-dispersed count data. Objectives: This paper studies a new two-parameter probability distribution, the Quasi-Negative Binomial Distribution (QNBD), which generalizes the well-known negative binomial distribution. The model turns out to be quite flexible for analyzing count data. Our main objectives are to estimate the parameters of the proposed distribution and to discuss its applicability to genetics data. As an application, we demonstrate how the QNBD regression representation can be used to model genomics data sets. Results: The new distribution is shown to provide a good fit as judged by the Akaike Information Criterion (AIC), used here as a measure of model goodness of fit. The proposed distribution may serve as a viable alternative to other distributions available in the literature for modeling count data exhibiting overdispersion, arising in various fields of scientific investigation such as genomics and biomedicine.
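
The mean-variance contrast and the AIC comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and it does not implement the QNBD, whose probability function is not given on this page. It only compares a Poisson fit with a standard negative binomial fit. The count vector is made-up toy data, the function names are hypothetical, and the negative binomial size parameter is found by a crude grid search (with the mean fixed at the sample mean) rather than by a proper optimizer.

```python
import math
from statistics import mean, variance


def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)


def negbin_loglik(counts, mu, r):
    """Log-likelihood of i.i.d. negative binomial counts with mean mu
    and size r, so that Var(Y) = mu + mu**2 / r."""
    p = r / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log(1.0 - p)
        for y in counts
    )


def aic(loglik, n_params):
    """Akaike Information Criterion: smaller values indicate a better fit."""
    return 2 * n_params - 2 * loglik


def fit_and_compare(counts):
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)

    # Poisson: the maximum likelihood estimate of lambda is the sample mean.
    ll_pois = poisson_loglik(counts, m)

    # Negative binomial: in the (mu, r) parameterization the MLE of mu is
    # the sample mean; r is approximated on a log-spaced grid (1e-3 to 1e3).
    grid = [10 ** (k / 50) for k in range(-150, 151)]
    r_hat = max(grid, key=lambda r: negbin_loglik(counts, m, r))
    ll_nb = negbin_loglik(counts, m, r_hat)

    return {
        "mean": m,
        "variance": v,
        "r_hat": r_hat,
        "aic_poisson": aic(ll_pois, 1),
        "aic_negbin": aic(ll_nb, 2),
    }


# Toy over-dispersed counts, invented purely for illustration.
counts = [0, 1, 0, 3, 7, 0, 2, 12, 1, 0, 5, 0, 9, 2, 0, 1, 15, 0, 4, 3]
result = fit_and_compare(counts)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For these toy counts the sample variance is well above the sample mean, and the negative binomial attains the lower AIC despite its extra parameter. A comparison against the QNBD would follow the same pattern once its log-likelihood is supplied from the full paper.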

