In this work, a comprehensive framework for traditional outlier detection techniques based on simple and multiple linear regression models was studied. Two data sets were used to illustrate and evaluate each class of outlier detection techniques (analytical and graphical methods). Outlier detection aims to identify outlying observations in order to improve data analysis and support the building of a suitable model. Furthermore, the different methods were compared to highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques. The results show that removing the influential points (or outliers) increased model adequacy (R² rose from 0.72 to 0.97). Jackknife residuals and Atkinson's measure were observed to be very useful in detecting outliers; hence, both methods are recommended for outlier detection.
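As an illustration of the two recommended diagnostics, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes the standard textbook definitions: the jackknife (externally studentized) residual t_i = e_i / (s_(i) sqrt(1 - h_ii)), and Atkinson's measure A_i = |t_i| sqrt(((n - p)/p) h_ii/(1 - h_ii)), where h_ii is the leverage, p the number of regression coefficients, and s_(i) the residual standard error with case i deleted. The function name jackknife_and_atkinson and the small synthetic data set with one planted outlier are hypothetical and are not taken from the paper.

```python
import numpy as np

def jackknife_and_atkinson(X, y):
    """Return leverages, jackknife residuals and Atkinson's measure.

    X is an (n, p) design matrix of full column rank (include a column
    of ones for the intercept); y is the response vector of length n.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Leverages h_ii: diagonal of the hat matrix, computed from the
    # reduced QR factorisation of X (H = Q Q^T).
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)

    # Ordinary least squares fit and raw residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta

    # Residual variance with all cases, then with case i deleted.
    s2 = e @ e / (n - p)
    s2_deleted = ((n - p) * s2 - e**2 / (1.0 - h)) / (n - p - 1)

    # Jackknife (externally studentized) residuals.
    t = e / np.sqrt(s2_deleted * (1.0 - h))

    # Atkinson's measure (assumed form: modified Cook's statistic).
    A = np.abs(t) * np.sqrt((n - p) / p * h / (1.0 - h))
    return h, t, A

# Hypothetical data: a straight line with small deterministic noise
# and one planted outlier at index 4.
x = np.arange(10.0)
y = 1.0 + 2.0 * x + 0.1 * np.sin(x)
y[4] += 8.0
X = np.column_stack([np.ones_like(x), x])

h, t, A = jackknife_and_atkinson(X, y)
print("largest |jackknife residual| at index", int(np.argmax(np.abs(t))))
print("largest Atkinson measure at index", int(np.argmax(A)))
```

In this sketch the planted case stands out on both diagnostics because deleting it leaves an almost perfect fit, so its deleted-case standard error is very small. Cut-off values for flagging a case are a separate choice and are not fixed here.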