Background: In discrete-time event history analysis, subjects are measured once each
time period until they experience the event, drop out prematurely, or the
study concludes. This implies that the event status of a subject in each time
period determines whether (s)he should be measured in subsequent time periods.
For that reason, intermittently missing event status causes a problem because,
unlike in other repeated measurement designs, where missing measurements can
simply be ignored as long as the dropout is ignorable, it does not make sense
to ignore the missing event status in the analysis. Method: We used Monte Carlo simulation to
evaluate and compare various alternatives for dealing with missing event status,
including event occurrence recall, assuming event (non-)occurrence, case deletion,
period deletion, and single and multiple imputation. We also illustrated the
methods’ performance in the analysis of an empirical example on relapse to drug
use. Result: The strategies assuming event (non-)occurrence and the
recall strategy performed worst, with substantial parameter
bias and a sharp decrease in coverage rate. Deletion methods suffered from
either a loss of power or undercoverage resulting
from a biased standard error. Single imputation removed the bias but
showed undercoverage. Multiple imputation performed reasonably well, with a negligible
standard error bias leading to a gradual decrease in power. Conclusion: On the basis of the simulation results and the real example, we provide practical
guidance to researchers on the best ways to deal with missing event
history data.
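
To make the problem concrete, the following minimal Python sketch shows how an intermittently missing event status might look in a person-period layout and how four of the simple strategies named above could act on it. The toy data, the function names, and the exact definition of each strategy are illustrative assumptions, not the procedures used in the simulation study; imputation and recall are not shown.

```python
from typing import Optional

# Hypothetical toy data: one list per subject, one entry per time period.
# 0 = no event, 1 = event, None = event status missing in that period.
Record = list[Optional[int]]

subjects: dict[str, Record] = {
    "A": [0, 0, 1],        # event observed in period 3
    "B": [0, None, 0, 0],  # intermittent missing status in period 2
    "C": [0, 0, 0, 0],     # no event by the end of the study
}

def assume_non_occurrence(rec: Record) -> list[int]:
    """Assumed definition: treat every missing status as 'no event'."""
    return [0 if s is None else s for s in rec]

def assume_occurrence(rec: Record) -> list[int]:
    """Assumed definition: treat a missing status as an event.

    Periods after the first event are dropped, because a subject
    leaves the risk set once the event has occurred.
    """
    out: list[int] = []
    for s in rec:
        status = 1 if s is None else s
        out.append(status)
        if status == 1:
            break
    return out

def case_deletion(data: dict[str, Record]) -> dict[str, Record]:
    """Assumed definition: drop every subject with any missing status."""
    return {sid: rec for sid, rec in data.items() if None not in rec}

def period_deletion(rec: Record) -> list[tuple[int, int]]:
    """Assumed definition: drop only the person-period rows with a
    missing status, keeping (period, status) pairs for the rest."""
    return [(t, s) for t, s in enumerate(rec, start=1) if s is not None]

if __name__ == "__main__":
    print(assume_non_occurrence(subjects["B"]))
    print(assume_occurrence(subjects["B"]))
    print(sorted(case_deletion(subjects)))
    print(period_deletion(subjects["B"]))
```

For subject B, the two assumption-based strategies produce different event histories from the same observed data. This illustrates why the treatment of a single missing period can matter for the estimated hazard.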