In statistical practice, it is not uncommon to use the same data for both
model selection and inference without accounting for the variability induced
by the model selection step. This is usually referred to as post-model
selection inference. Although the shortcomings of this practice are widely
recognized, finding a general solution is extremely challenging. We propose a
model averaging alternative that takes both the model selection probability
and the likelihood into account when assigning the weights. The approach is
applied to Bernoulli trials, where it outperforms Akaike weights model
averaging and post-model selection estimators.
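
For orientation, the two baselines named above can be sketched for Bernoulli trials. This is a minimal illustration and not the proposed method, whose weights are not specified here. The two candidate models (a fixed success probability `p0 = 0.5` versus a freely estimated probability) and all function names are assumptions chosen for the example. Akaike weights follow the standard form w_k = exp(-Δ_k / 2) / Σ_j exp(-Δ_j / 2), where Δ_k is the AIC difference from the best model.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_loglik(p, successes, n):
    """Bernoulli log-likelihood (the binomial coefficient is omitted
    because it is the same for every candidate model)."""
    return _xlogy(successes, p) + _xlogy(n - successes, 1.0 - p)


def akaike_weights(aics):
    """Standard Akaike weights computed from a list of AIC values."""
    best = min(aics)
    raw = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]


def compare_estimators(successes, n, p0=0.5):
    """Hypothetical two-model setup (an assumption for illustration):
    model 0 fixes p = p0 (no free parameter),
    model 1 estimates p by maximum likelihood (one free parameter)."""
    candidates = [(p0, 0), (successes / n, 1)]  # (estimate, number of parameters)
    aics = [-2.0 * bernoulli_loglik(p, successes, n) + 2.0 * k
            for p, k in candidates]
    weights = akaike_weights(aics)

    # Akaike-weights model averaging: weighted mean of the candidate estimates.
    averaged = sum(w * p for w, (p, _) in zip(weights, candidates))

    # Post-model selection estimator: the estimate from the minimum-AIC model.
    selected = candidates[aics.index(min(aics))][0]

    return {"aics": aics, "weights": weights,
            "averaged": averaged, "selected": selected}


if __name__ == "__main__":
    result = compare_estimators(successes=7, n=10)
    print(result)
```

The post-model selection estimator switches abruptly between the candidate estimates as the data change, whereas the averaged estimator moves between them continuously. Neither reflects the probability that a given model is selected, which is the ingredient the proposed weighting adds.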