It is known that the Pearson test and the G-test of independence have the same asymptotic distribution, namely $\chi^2$ with $(r-1)(c-1)$ degrees of freedom, where $r$ and $c$ are the numbers of levels, respectively, of the two attributes. However, when the accuracy of this approximation is extended by $1/n$-proportional terms (with $n$ the sample size), the resulting corrections differ quite dramatically. The purpose of this article is to derive each of these corrections and to demonstrate their use in practical applications.
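As a concrete reference point, the two statistics being compared can be computed from a two-way table of counts as in the minimal sketch below. It uses the standard textbook definitions only and does not implement the corrections derived in the article. The function name `independence_statistics` and the sample counts are illustrative assumptions, and every row and column total is assumed to be positive.

```python
import math


def independence_statistics(table):
    """Return (pearson, g, dof) for a two-way table of observed counts.

    Hypothetical helper for illustration; assumes every row total and
    every column total is positive, so all expected counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    pearson = 0.0
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two attributes.
            expected = row_totals[i] * col_totals[j] / n
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # an empty cell contributes nothing to G
                g += 2.0 * observed * math.log(observed / expected)

    # Degrees of freedom: (rows - 1) * (columns - 1).
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return pearson, g, dof


# Made-up counts, used only to show the call.
counts = [[12, 5, 9], [7, 14, 10]]
pearson, g, dof = independence_statistics(counts)
print(f"Pearson = {pearson:.4f}, G = {g:.4f}, degrees of freedom = {dof}")
```

Both values are referred to the same limiting chi-squared distribution, yet they generally differ for a finite sample, which is the regime the higher-order corrections address.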