This paper considers the problem of nonparametric estimation of finite
population totals in high-dimensional datasets. A robust estimator of the
finite population total based on a feedforward backpropagation neural network
is derived with the aid of a superpopulation model. The study is motivated by
the fact that local polynomial and kernel methods have been shown in earlier
related studies to provide good estimators of finite population totals, but
only in low dimensions. Even in those settings, bias at boundary points
presents a major challenge when these estimators are used to estimate finite
population parameters. The challenge worsens as the dimension of the
regressors increases: as the dimension of the regressor vector grows, the
regressor values become increasingly sparse in the design space, which lowers
the fastest achievable rates of convergence of the regression function
estimators towards the target curve and renders kernel methods and local
polynomials ineffective. This study therefore considers artificial neural
networks, which yield robust estimators in high dimensions and reduce the
estimation bias with only a marginal increase in variance. This is due to
their multilayer structure, which can approximate a wide range of functions to
any required level of precision. The estimator’s properties are developed, and
its performance is compared with that of existing estimators using real data
from the United Nations Development Programme 2020, in a study of the Human
Development Index against other factors, where the estimation approach
performs well. The theoretical and practical results support recommending the
neural network estimator for survey sampling estimation of the finite
population total.
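
As a rough illustration of the kind of estimator described above, the sketch below uses the standard model-based prediction form: the sampled values are summed, and the values of the nonsampled units are predicted by a fitted regression function. This form, the single tanh hidden layer, the training settings, and the function names (`fit_network`, `nn_population_total`) are assumptions made for illustration and are not taken from the paper. The synthetic population stands in for real survey data, and the regressors are assumed to be scaled to roughly [-1, 1].

```python
import math
import random


def _hidden(W, b, x):
    """Hidden-layer activations tanh(W x + b) for one regressor vector x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def fit_network(X, y, hidden=4, epochs=300, lr=0.05, seed=0):
    """Fit a one-hidden-layer network by full-batch backpropagation (squared error)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b = [0.0] * hidden
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    c = 0.0
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(hidden)]
        gb, gv, gc = [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in zip(X, y):
            h = _hidden(W, b, x)
            err = sum(vj * hj for vj, hj in zip(v, h)) + c - t
            gc += err
            for j in range(hidden):
                gv[j] += err * h[j]
                # Backpropagate the error through the tanh unit.
                delta = err * v[j] * (1.0 - h[j] ** 2)
                gb[j] += delta
                for k in range(d):
                    gW[j][k] += delta * x[k]
        # Gradient-descent update using the averaged gradients.
        c -= lr * gc / n
        for j in range(hidden):
            v[j] -= lr * gv[j] / n
            b[j] -= lr * gb[j] / n
            for k in range(d):
                W[j][k] -= lr * gW[j][k] / n

    def predict(x):
        return sum(vj * hj for vj, hj in zip(v, _hidden(W, b, x))) + c

    return predict


def nn_population_total(X_pop, y_sample, sample_idx, **fit_kwargs):
    """Assumed model-based form: sum of sampled y plus predictions for nonsampled units."""
    sampled = set(sample_idx)
    nonsample = [i for i in range(len(X_pop)) if i not in sampled]
    if not nonsample:
        # A census needs no prediction.
        return float(sum(y_sample))
    # Standardise the response so the fixed learning rate behaves sensibly.
    mean = sum(y_sample) / len(y_sample)
    var = sum((t - mean) ** 2 for t in y_sample) / len(y_sample)
    sd = math.sqrt(var) or 1.0
    z = [(t - mean) / sd for t in y_sample]
    m_hat = fit_network([X_pop[i] for i in sample_idx], z, **fit_kwargs)
    predicted = sum(mean + sd * m_hat(X_pop[i]) for i in nonsample)
    return float(sum(y_sample) + predicted)


# Synthetic population (illustrative only): three regressors, nonlinear mean function.
rng = random.Random(1)
N, n = 200, 60
X_pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(N)]
y_pop = [2.0 + math.sin(2 * x[0]) + x[1] * x[2] + rng.gauss(0, 0.1) for x in X_pop]
sample_idx = rng.sample(range(N), n)
y_sample = [y_pop[i] for i in sample_idx]
estimate = nn_population_total(X_pop, y_sample, sample_idx)
print("estimated total:", round(estimate, 2), "true total:", round(sum(y_pop), 2))
```

When every unit is sampled, the function returns the exact total, since no prediction is needed. In practice the number of hidden units and the number of training epochs would need to be tuned to the data, for example by cross-validation.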