Ground Ozone Level Prediction Using Machine Learning

DOI: 10.4236/jsea.2019.1210026, PP. 423-431

Keywords: Ground Ozone Pollution, Machine Learning, Classification, Logistic Regression, Decision Tree, Random Forest, AdaBoost, Support Vector Machine

Abstract:

Because of growing attention to environmental issues, especially air pollution, predicting whether a given day will be polluted matters for public health. This research addresses the problem by classifying ground ozone level using big data and machine learning models, where an ozone (polluted) day is class 1 and a non-ozone day is class 0. The dataset comes from the UCI Machine Learning Repository and contains various environmental factors for the Houston, Galveston and Brazoria area that could affect the occurrence of ozone pollution [1]. The dataset is first filled in for further processing, then standardized so that every feature carries comparable weight, and then split into a training set and a testing set. Five machine learning models are then used to predict ground ozone level, and their final accuracy scores are compared. Among Logistic Regression, Decision Tree, Random Forest, AdaBoost, and Support Vector Machine (SVM), SVM has the highest test score, 0.949. The research uses relatively simple forecasting methods and calculates the first accuracy scores for predicting ground ozone level, so it can serve as a reference for environmentalists. The direct comparison among the five models also gives the machine learning field some insight into which model is most accurate. In the future, a Neural Network could also be used to predict air pollution, and its test scores could be compared with those of the five methods above to assess its accuracy.
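
The preprocessing order described in the abstract (fill, standardize, split, fit, score) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the data are synthetic, the two feature names (temperature, wind speed) and the labeling rule are assumptions made up for the example, "filling" is assumed to mean column-mean imputation, and a small hand-written logistic regression stands in for the five library models compared in the paper.

```python
import math
import random

def column_means(rows):
    """Mean of each column, ignoring missing values (None)."""
    means = []
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        means.append(sum(present) / len(present))
    return means

def impute(rows, means):
    """Replace each missing value with the given column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def fit_scaler(rows):
    """Per-column mean and (population) standard deviation."""
    n, n_cols = len(rows), len(rows[0])
    mus = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    sigmas = []
    for j in range(n_cols):
        var = sum((r[j] - mus[j]) ** 2 for r in rows) / n
        sigmas.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by 0
    return mus, sigmas

def standardize(rows, mus, sigmas):
    """Z-score each value using the supplied column statistics."""
    return [[(v - m) / s for v, m, s in zip(r, mus, sigmas)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent on the mean logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(X, y, w, b):
    """Fraction of rows whose predicted class matches the label."""
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def make_synthetic(n=300, seed=0):
    """Hypothetical data: [temperature, wind speed] -> ozone day (1) or not (0)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        temp, wind = rng.gauss(25, 8), rng.gauss(10, 3)
        labels.append(1 if temp - 2 * wind + rng.gauss(0, 3) > 12 else 0)
        row = [temp, wind]
        if rng.random() < 0.05:  # knock out a value to mimic missing readings
            row[rng.randrange(2)] = None
        rows.append(row)
    return rows, labels

def run_demo(test_fraction=0.2, seed=0):
    """Run the full sketch and return (train accuracy, test accuracy)."""
    rows, labels = make_synthetic(seed=seed)
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    # Statistics come from the training rows only, so the test set stays unseen.
    means = column_means([rows[i] for i in train])
    X_train = impute([rows[i] for i in train], means)
    X_test = impute([rows[i] for i in test], means)
    mus, sigmas = fit_scaler(X_train)
    X_train = standardize(X_train, mus, sigmas)
    X_test = standardize(X_test, mus, sigmas)
    y_train, y_test = [labels[i] for i in train], [labels[i] for i in test]
    w, b = train_logistic(X_train, y_train)
    return accuracy(X_train, y_train, w, b), accuracy(X_test, y_test, w, b)

if __name__ == "__main__":
    train_acc, test_acc = run_demo()
    print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

The sketch computes the imputation and scaling statistics on the training rows alone, which is one common way to keep the test score honest; the abstract describes standardizing before splitting, so the ordering here is a design choice of this example rather than the paper's procedure. The resulting accuracy is a property of the synthetic data and says nothing about the 0.949 reported above.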

References

[1]  Dua, D. and Graff, C. (2019) UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine, CA.
http://archive.ics.uci.edu/ml
[2]  Jolliffe, I. (2011) Principal Component Analysis. Springer, Berlin Heidelberg.
https://doi.org/10.1007/978-3-642-04898-2_455
[3]  Li, S.S. (2019) Building A Logistic Regression in Python, Step by Step. Medium, Towards Data Science, 27 Feb.
https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8
[4]  AdaBoost Classifier in Python. DataCamp Community.
https://www.datacamp.com/community/tutorials/adaboost-classifier-python
[5]  Decision Tree Classification in Python. DataCamp Community.
https://www.datacamp.com/community/tutorials/decision-tree-classification-python
[6]  Random Forests Classifiers in Python. DataCamp Community.
https://www.datacamp.com/community/tutorials/random-forests-classifier-python
[7]  Support Vector Machines in Scikit-Learn. DataCamp Community.
https://www.datacamp.com/community/tutorials/svm-classification-scikit-learn-python
