Objectives: The objective is to analyze how the correlation structure and the
values of the regressor variables interact in the estimation of a linear model
when the residual errors have a constant, possibly negative, intra-class
correlation and the group sizes are equal. Specifically: 1) How does the
variance of the generalized least squares (GLS) estimator (GLSE) depend on the
regressor values? 2) What is the bias in the estimated variances when the
ordinary least squares (OLS) estimator is used? 3) In what cases are OLS and
GLS equivalent? 4) How can the best linear unbiased estimator (BLUE) be constructed
when the covariance matrix is singular? The purpose is to make general matrix
results understandable. Results: The effects of the regressor values can
be expressed in terms of the intra-class correlations of the regressors. If the
intra-class correlation of residuals is large, then it is beneficial to have
small intra-class correlations of the regressors, and vice versa. The algebraic
presentation of GLS shows how the GLSE gives different weight to the
between-group effects and the within-group effects, in what cases OLSE is equal
to GLSE, and how BLUE can be constructed when the residual covariance matrix is
singular. Different situations arise when the intra-class correlations of the
regressors take their extreme or intermediate values. The derivations
lead to a BLUE that combines OLS and GLS weighting in a single estimator, which
can also be obtained using general matrix theory. It is indicated how the
analysis can be generalized to unequal group sizes. The analysis gives insight
into models where between-group effects and within-group effects are used as
separate regressors.
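As a minimal numerical sketch of the comparison discussed above (all variable names and parameter values are illustrative assumptions, not taken from the paper), the following builds a compound-symmetry residual covariance for equal-sized groups, computes the OLSE and GLSE, and checks the BLUE property that the GLSE variance cannot exceed the correct OLSE variance:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setup: m groups of equal size n, residual covariance with a
# constant intra-class correlation rho (compound symmetry).
m, n, rho, sigma2 = 20, 5, 0.4, 1.0
N = m * n

# Each n x n within-group block is sigma2 * ((1 - rho) I + rho J).
block = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
V = np.kron(np.eye(m), block)  # block-diagonal over the m groups

# Simple design: intercept plus one regressor.
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = X @ np.array([1.0, 2.0]) + rng.multivariate_normal(np.zeros(N), V)

# OLSE: (X'X)^{-1} X'y ;  GLSE: (X'V^{-1}X)^{-1} X'V^{-1}y
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)

# Sampling variances under the true covariance V: the GLSE attains the
# BLUE bound, so its variance is at most the correct OLSE variance.
XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ V @ X @ XtX_inv  # correct Var(OLSE)
var_gls = np.linalg.inv(X.T @ Vi @ X)      # Var(GLSE)
print(var_ols[1, 1], var_gls[1, 1])
```

Repeating the computation with different regressor values (for example, a regressor that is constant within groups versus one that varies only within groups) illustrates how the variance of the GLSE depends on the intra-class correlation of the regressor itself.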