%0 Journal Article
%T On Imputing Binary Data via Pairwise Associations and Corresponding Conditional Probabilities
%A Irene B. HELENOWSKI
%A Hakan DEMİRTAŞ
%A Beyza DOĞANAY ERDOĞAN
%J Turkiye Klinikleri Journal of Biostatistics
%D 2012
%I Turkiye Klinikleri
%X Objective: In this work, we present a method for imputing binary data without making any multinomial or loglinear model assumptions. Material and Methods: Our approach employs principles of generating binary data from multivariate normally distributed values, as discussed in Emrich and Piedmonte [1]. Specifically, a data set that follows a multivariate normal distribution is generated separately from the observed data, using the marginal binary proportions and a matrix associated with pairwise tetrachoric correlations derived from phi coefficients. The same fraction of missing information is introduced into the generated data as in the original multivariate binary data. Multiple imputation is then applied to the generated values via joint modeling under the normality assumption, and the imputed values are dichotomized at quantiles corresponding to the binary proportions computed from the original incomplete data. Results: Applying our imputation method to generated data in simulation studies and to real data examples led to promising results: average estimates (AE) of pairwise correlation parameters were comparable to the true correlation values for generated data and to the original correlation estimates for real data; standardized bias (SB) values were < 50%; root-mean-square error (RMSE) values were small, indicating good accuracy and precision; coverage rates (CR) were > 90%; and average widths (AW) of confidence intervals for correlation parameter estimates from the imputed data were comparable to the 95% confidence interval widths of the true correlation values for generated data or of the original correlation estimates for real data. Conclusion: Simulation studies and real data applications indicate that this new method is a promising approach to imputing binary data while relaxing multinomial and loglinear model assumptions.
%K Missing data
%K multiple imputation
%K multivariate normal distribution
%K binary data
%U http://www.turkiyeklinikleri.com/pdf/?pdf=6bdcf67dfb2f53f87fc320b730cb317f