In experimental research,
investigators often need to select the best subset of variables and estimate
the resulting model at the same time. Selecting the best subset of
variables can improve prediction accuracy, because noninformative variables are
removed. A model with high prediction accuracy can then be used
for forecasting. In this paper, we investigate the
differences between several variable selection methods. The aim is to compare
a frequentist method (backward elimination),
a penalised shrinkage method (the Adaptive LASSO) and Least Angle Regression (LARS)
for selecting the active variables in data produced by a blocked
experiment. The results of the comparative study support the use of the
LARS method for the statistical analysis of data from blocked experiments.
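To make the variable selection task concrete, the following is a minimal sketch of one of the compared methods, the Adaptive LASSO, applied to simulated data from a small blocked experiment. It is an illustration under stated assumptions, not the procedure used in the paper: block effects are treated as fixed and removed by centring within blocks (the paper's analysis may instead model blocks as random effects), the penalty value `lam=0.2` is chosen by hand, and the design, effect sizes, and the helper names `center_within_blocks`, `soft_threshold` and `adaptive_lasso` are all hypothetical.

```python
import numpy as np

def center_within_blocks(a, blocks):
    """Subtract each block's mean (assumption: fixed block effects)."""
    a = np.asarray(a, dtype=float).copy()
    for b in np.unique(blocks):
        mask = blocks == b
        a[mask] -= a[mask].mean(axis=0)
    return a

def soft_threshold(x, t):
    """Soft-thresholding operator used by L1-penalised coordinate descent."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def adaptive_lasso(X, y, lam, gamma=1.0, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j| by coordinate
    descent, with adaptive weights w_j = 1 / |b_ols_j|^gamma."""
    n, p = X.shape
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    weights = 1.0 / (np.abs(beta_ols) ** gamma + 1e-12)
    col_scale = (X ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of variable j added back in.
            partial_resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_resid / n
            beta[j] = soft_threshold(rho, lam * weights[j]) / col_scale[j]
    return beta

# Hypothetical blocked experiment: 4 blocks of 6 runs, 6 two-level factors,
# of which only factors 0 and 2 are truly active.
rng = np.random.default_rng(0)
n_blocks, block_size, p = 4, 6, 6
blocks = np.repeat(np.arange(n_blocks), block_size)
X = rng.choice([-1.0, 1.0], size=(n_blocks * block_size, p))
true_beta = np.array([3.0, 0.0, -2.0, 0.0, 0.0, 0.0])
block_effects = rng.normal(0.0, 1.0, size=n_blocks)
noise = rng.normal(0.0, 0.3, size=len(blocks))
y = X @ true_beta + block_effects[blocks] + noise

Xc = center_within_blocks(X, blocks)
yc = center_within_blocks(y, blocks)
beta_hat = adaptive_lasso(Xc, yc, lam=0.2)
selected = [j for j in range(p) if abs(beta_hat[j]) > 1e-8]
print("selected factors:", selected)
```

In practice the penalty `lam` would be tuned, for example by cross-validation or an information criterion, rather than fixed in advance. Backward elimination and LARS could be run on the same centred data to compare the selected subsets.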