Why Can Multiple Imputations and How (MICE) Algorithm Work?

DOI: 10.4236/ojs.2021.115045, PP. 759-777

Keywords: Multiple Imputations, Imputations, Algorithms, MICE Algorithm


Abstract:

Multiple imputation compensates for missing data by using regression models to produce several completed datasets, and it is regarded as the remedy for the long-standing problems of univariate imputation. Univariate imputation fills in a missing cell using only the column in which that cell occurs. Multivariate imputation works simultaneously with all variables in all columns, whether their values are missing or observed, and it has emerged as a principal method for solving missing-data problems. All incomplete datasets analyzed before Multiple Imputation by Chained Equations (MICE) was introduced were misdiagnosed; the results obtained were invalid and should not be relied on to yield reasonable conclusions. This article highlights why multiple imputation is needed and how the MICE algorithm works, with a particular focus on a cyber-security dataset. Identifying missing data in any dataset and replacing it is imperative for analyzing the data and building prediction models. A good imputation technique should therefore recover the missing values, which involves extracting the informative features. However, the widely used univariate imputation method does not impute missing values reasonably if the values are too large, and it may thus lead to bias. We therefore aim to propose an alternative imputation method that is efficient and removes the potential bias once the missing values have been replaced.
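
The contrast between univariate and chained-equation imputation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-column toy data, the function names (`fit_line`, `mean_impute`, `chained_impute`), and the use of simple linear regression are all assumptions made for illustration. A full MICE procedure would also draw the regression parameters from their posterior distribution and support more than two variables; this sketch only adds residual noise when a random generator is supplied.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line ys ~ a + b*xs; also returns the residual SD."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (sse / (len(xs) - 2)) ** 0.5

def mean_impute(rows):
    """Univariate imputation: fill each column with its own observed mean."""
    cols = list(zip(*rows))
    means = [mean(v for v in col if v is not None) for col in cols]
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def chained_impute(rows, n_iter=10, rng=None):
    """Chained equations for two columns: regress each column on the other
    in turn and refresh its missing cells from the fitted line."""
    missing = [[v is None for v in r] for r in rows]
    data = mean_impute(rows)  # starting values only
    for _ in range(n_iter):
        for target in (0, 1):
            other = 1 - target
            # Fit on rows where the target column was actually observed.
            obs = [i for i in range(len(rows)) if not missing[i][target]]
            a, b, sd = fit_line([data[i][other] for i in obs],
                                [data[i][target] for i in obs])
            for i in range(len(rows)):
                if missing[i][target]:
                    noise = rng.gauss(0, sd) if rng else 0.0
                    data[i][target] = a + b * data[i][other] + noise
    return data

# Hypothetical toy data following y = 2x + 1, with one gap per column.
rows = [[1, 3], [2, 5], [3, None], [4, 9], [None, 11], [6, 13]]

print("mean imputation:   ", mean_impute(rows))
print("chained regression:", chained_impute(rows))

# Several stochastic runs give the multiple completed datasets.
imputations = [chained_impute(rows, rng=random.Random(seed)) for seed in range(5)]
```

On this toy data, mean imputation fills each gap with its column mean regardless of the other variable, whereas the chained regression uses the relationship between the two columns and settles close to the values implied by the line. Running the stochastic variant with different seeds yields several completed datasets, which is the sense in which the imputation is "multiple".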
