Several statistical methods have been developed for adjusting the odds ratio of the relation between two dichotomous variables X and Y for confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method, however, works only for confounders with a limited number of discrete strata, which limits its utility, and it appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector-valued Z. The idea is to replace the observed cell entries in the strata of the Mantel-Haenszel procedure with subject-specific classification probabilities for the four possible values of (X,Y), as predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically, we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented.
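The idea of replacing stratum cell counts with model-predicted probabilities can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: each subject is treated as a stratum of size one, the four classification probabilities come from a baseline-category multinomial logit with (X,Y) = (0,0) as reference, and homogeneity is expressed as the slope constraint b11 = b10 + b01. The function names, parameter values, and the scalar confounder z are hypothetical and chosen only for illustration. Under these assumptions the log odds ratio reduces to a11 - a10 - a01, one possible reading of the "linear combination of three parameters" mentioned above.

```python
import numpy as np


def mantel_haenszel_or(tables):
    """Classical Mantel-Haenszel odds ratio.

    tables has shape (K, 2, 2) with tables[k, x, y] the count of
    subjects in stratum k having X = x and Y = y.
    """
    t = np.asarray(tables, dtype=float)
    n = t.sum(axis=(1, 2))                       # stratum sizes
    num = (t[:, 1, 1] * t[:, 0, 0] / n).sum()    # concordant products
    den = (t[:, 1, 0] * t[:, 0, 1] / n).sum()    # discordant products
    return num / den


def probability_mh_or(probs):
    """Mantel-Haenszel-type ratio using predicted probabilities.

    probs has shape (n, 4); row i holds (p00, p10, p01, p11) for
    subject i. Assumption: each subject acts as its own stratum.
    """
    p = np.asarray(probs, dtype=float)
    num = (p[:, 3] * p[:, 0]).sum()
    den = (p[:, 1] * p[:, 2]).sum()
    return num / den


def multinomial_probs(z, alpha, beta):
    """Baseline-category logit probabilities with (0,0) as reference.

    alpha and beta hold (intercept, slope) values for the categories
    (1,0), (0,1), (1,1) in that order; z is a scalar confounder.
    """
    z = np.asarray(z, dtype=float)
    eta = np.column_stack(
        [np.zeros_like(z)] + [a + b * z for a, b in zip(alpha, beta)]
    )
    eta -= eta.max(axis=1, keepdims=True)        # numerical stability
    w = np.exp(eta)
    return w / w.sum(axis=1, keepdims=True)


# Hypothetical parameter values; the slopes satisfy b11 = b10 + b01,
# which is the homogeneity constraint assumed in this sketch.
z = np.linspace(-2.0, 2.0, 50)
alpha = (-0.5, -0.3, 0.4)    # a10, a01, a11
beta = (0.8, -0.6, 0.2)      # b10, b01, b11

probs = multinomial_probs(z, alpha, beta)
or_hat = probability_mh_or(probs)
log_or_from_params = alpha[2] - alpha[0] - alpha[1]

print(np.log(or_hat), log_or_from_params)
```

In practice the probabilities would come from a multinomial logistic model fitted to data rather than from fixed parameters; the fixed values here only make the homogeneity property easy to check, since every subject then has the same odds ratio regardless of z.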