Development of a Modelling Script of Time Series Suitable for Data Mining

DOI: 10.4236/ojs.2016.64047, PP. 555-564

Keywords: Data Mining, ARIMA Models, Time Series, Script, R

Abstract:

Data Mining has become an important technique for exploring and extracting data in research projects across many fields (technology, information technology, business, the environment, economics, etc.). For analysing and visualising large amounts of time-ordered data (time series) extracted through Data Mining, free software such as R has emerged internationally as an inexpensive and efficient tool. This has allowed the development of models that help to extract the most relevant information from large volumes of data. In this regard, a script has been developed to implement ARIMA models, presenting them as useful and quick mechanisms for the extraction, analysis and visualisation of large data volumes, with the added advantage that they can be applied in many branches of knowledge, including economics, demography, physics, mathematics and fisheries. ARIMA models therefore appear as a Data Mining technique that offers reliable, robust and high-quality results to help validate and support the research carried out.
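
To make the idea of an ARIMA model concrete, the following is a minimal sketch of the simplest non-trivial case, ARIMA(1,1,0): the series is differenced once, an AR(1) model is fitted to the differences by least squares, and forecasts are obtained by accumulating the predicted differences. This is not the script described in the abstract, which is written in R and is not reproduced on this page. The function names (`difference`, `fit_ar1`, `forecast_arima110`) and the toy series are assumptions made purely for illustration, and a real analysis would normally rely on an established routine such as R's `arima`.

```python
from statistics import mean


def difference(series):
    """First-order differencing: the 'I' step of ARIMA with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(z):
    """Least-squares estimate of (c, phi) in z[t] = c + phi * z[t-1] + e[t].

    Illustrative only: requires at least three points and a
    non-constant series, otherwise the slope is undefined.
    """
    x, y = z[:-1], z[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi


def forecast_arima110(series, steps):
    """Forecast `steps` values ahead with a hand-rolled ARIMA(1,1,0)."""
    z = difference(series)
    c, phi = fit_ar1(z)
    level, last_diff = series[-1], z[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next difference
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


# Hypothetical toy series whose differences follow z[t] = 1 + 0.5 * z[t-1].
toy_series = [0.0, 1.0, 2.5, 4.25, 6.125, 8.0625]
next_values = forecast_arima110(toy_series, steps=3)
```

The sketch omits everything a production workflow would need, such as stationarity tests, moving-average terms, order selection by an information criterion, and residual diagnostics.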

