%0 Journal Article
%T Beyond logistic regression: structural equations modelling for binary variables and its application to investigating unobserved confounders
%A Emil Kupek
%J BMC Medical Research Methodology
%D 2006
%I BioMed Central
%R 10.1186/1471-2288-6-13
%X A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of the odds ratio (OR) into the Q-metric, Q = (OR-1)/(OR+1), to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. The percentage of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on the Q-metric was also checked on a small (N = 100) random sample of the generated data and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would not have been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into an estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios, which are the most frequently used measure of effect in medical statistics. Although logistic regression has become the cornerstone of modelling categorical outcomes in medical statistics, separate regression analysis for each outcome of interest is hardly challenged as a pragmatic approach, even in situations where the outcomes are naturally related. This is common in process evaluation, where the same variable can be an outcome at one point in time and a predictor of another outcome in the future.
For example, preterm delivery is both an important obstetric outcome and a risk factor for low birthweight, which in turn can adversely affect future health. The sequential nature of these outcomes is not captured by repeated measures models, which deal with the same outcome at different time points. Another example of a research problem difficult to handle
%U http://www.biomedcentral.com/1471-2288/6/13