Estimating the Variance of the Proportion of Contaminated Soil by Petroleum Spills Using Two-Dimensional Systematic Sampling under Different Approaches
In leading petroleum-producing countries such as Kuwait, Brazil, Iran, Iraq
and Mexico, oil spills frequently occur on land and cause serious damage to
crop fields. Soil remediation requires constant monitoring of the polluted
area. One common monitoring method is two-dimensional systematic sampling,
which can be used to estimate the proportion of contaminated soil and to study
the geographic distribution of the spills. A well-known difficulty with this
sampling design is that deriving the variance of the sample mean (proportion)
analytically requires at least two independent samples. To address this
problem, this research proposes a variance estimator based on regression and a
corrected estimator that uses Geary's autocorrelation index, both under the
model-assisted approach. The estimators were constructed with the help of
geostatistical models, which were used to simulate an auxiliary variable.
Populations similar to those found in real oil spills were simulated, and the
accuracy of the proposed estimators was evaluated by comparing their
performance with that of other well-known estimators. The factors considered
in the simulation study were: a) the model used to simulate the populations
(exponential or wave), b) the mean and the variance of the process, and c) the
level of autocorrelation among units. Judged on statistical and computational
criteria (bias, ratio of estimated to true variance, convergence and computing
time), the regression estimator showed the best performance under the
exponential model, and under the wave model the corrected version performed
even better.
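
As a rough illustration of the setting described above, the following Python sketch draws an aligned two-dimensional systematic sample from a gridded population, estimates the proportion of contaminated cells, and computes Geary's contiguity ratio on the sampled lattice. It is not the estimator proposed in this research. The circular-patch population, the function names, the sampling interval of 4 and the rook-neighbour weights are all assumptions made for illustration, and the simple-random-sampling variance formula is included only as a naive baseline.

```python
import numpy as np


def systematic_sample_2d(grid, k_row, k_col, rng):
    """Aligned 2D systematic sample: one random start, then every k-th row and column."""
    r0 = int(rng.integers(k_row))
    c0 = int(rng.integers(k_col))
    return grid[r0::k_row, c0::k_col]


def srs_variance_of_proportion(sample, population_size):
    """Naive baseline: variance of a proportion as if the sample were simple random."""
    n = sample.size
    p = sample.mean()
    return (1.0 - n / population_size) * p * (1.0 - p) / (n - 1)


def geary_c(z):
    """Geary's contiguity ratio on a regular lattice with rook (edge-sharing) neighbours.

    Values below 1 suggest positive spatial autocorrelation, values above 1 negative.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    # Squared differences between horizontally and vertically adjacent cells
    d_h = (z[:, 1:] - z[:, :-1]) ** 2
    d_v = (z[1:, :] - z[:-1, :]) ** 2
    n_pairs = d_h.size + d_v.size  # each unordered neighbour pair counted once
    ss = np.sum((z - z.mean()) ** 2)
    if ss == 0:
        return float("nan")  # undefined when all values are equal
    # Equivalent to (n-1) * sum_ij w_ij (z_i - z_j)^2 / (2 W ss) with binary weights
    return (n - 1) * (d_h.sum() + d_v.sum()) / (2.0 * n_pairs * ss)


# Hypothetical population: 40 x 40 grid with one circular contaminated patch (1 = contaminated)
rows, cols = 40, 40
yy, xx = np.mgrid[0:rows, 0:cols]
population = ((yy - 15) ** 2 + (xx - 22) ** 2 <= 10 ** 2).astype(int)

rng = np.random.default_rng(0)
sample = systematic_sample_2d(population, 4, 4, rng)

p_hat = sample.mean()
v_srs = srs_variance_of_proportion(sample, population.size)
c_hat = geary_c(sample)

print("true proportion:     ", population.mean())
print("estimated proportion:", p_hat)
print("SRS-baseline variance:", v_srs)
print("Geary's C on sample: ", c_hat)
```

Because a single systematic sample has only one random start, the baseline variance above ignores the spatial structure of the spill. The Geary ratio is shown only as a diagnostic of how strongly neighbouring sample units resemble each other.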