All Title Author
Keywords Abstract


Using Hybrid and Diversity-Based Adaptive Ensemble Method for Binary Classification

DOI: 10.4236/ijis.2018.83003, PP. 43-74

Keywords: Bigdata Analytics, Machine Learning, Adaptive Ensemble Methods, Binary Classification

Full-Text   Cite this paper   Add to My Lib

Abstract:

This paper proposes an adaptive and diverse hybrid-based ensemble method to improve the performance of binary classification. The proposed method is a non-linear combination of base models and the application of adaptive selection of the most suitable model for each data instance. Ensemble method, an important machine learning technique uses multiple single models to construct a hybrid model. A hybrid model generally performs better compared to a single individual model. In a given dataset the application of diverse single models trained with different machine learning algorithms will have different capabilities in recognizing patterns in the given training sample. The proposed approach has been validated on Repeat Buyers Prediction dataset and Census Income Prediction dataset. The experiment results indicate up to 18.5% improvement on F1 score for the Repeat Buyers dataset compared to the best individual model. This improvement also indicates that the proposed ensemble method has an exceptional ability of dealing with imbalanced datasets. In addition, the proposed method outperforms two other commonly used ensemble methods (Averaging and Stacking) in terms of improved F1 score. Finally, our results produced a slightly higher AUC score of 0.718 compared to the previous result of AUC score of 0.712 in the Repeat Buyers competition. This roughly 1% increase AUC score in performance is significant considering a very big dataset such as Repeat Buyers.

References

[1]  David, O. and Maclin, R. (1999) Popular Ensemble Methods: An Empirical Study. Journal of Artificial Intelligence Research, 11, 169-198.
https://doi.org/10.1613/jair.614
[2]  Kuncheva, L.I. (2005) Diversity in Multiple Classifier Systems. Information Fusion, 6, 3-4.
https://doi.org/10.1016/j.inffus.2004.04.009
[3]  Canuto, A.M.P., et al. (2005) Performance and Diversity Evaluation in Hybrid and Non-Hybrid Structures of Ensembles. Fifth International Conference on Hybrid Intelligent Systems, Rio de Janeiro, 6-9 November 2005, 6.
https://doi.org/10.1109/ICHIS.2005.87
[4]  Canuto, A.M.P., et al. (2006) Using Weighted Dynamic Classifier Selection Methods in Ensembles with Different Levels of Diversity. International Journal of Hybrid Intelligent Systems, 3, 147-158.
https://doi.org/10.3233/HIS-2006-3303
[5]  Repeat Buyers Prediction Competition.
http://ijcai-15.org/index.php/repeat-buyers-prediction-competition
[6]  Archive.ics.uci.edu. (2016) UCI Machine Learning Repository.
http://archive.ics.uci.edu/ml/
[7]  Fan, X., Lung, C.-H. and Ajila, S.A. (2017) An Adaptive Diversity-Based Ensemble Method for Binary Classification. IEEE 41th Annual International Computers, Software & Applications Conference (COMPSAC’17), Torino, 4-8 July 2017.
https://doi.org/10.1109/COMPSAC.2017.49
[8]  David, B. (2012) Bayesian Reasoning and Machine Learning. Cambridge University Press, New York.
[9]  Sariel, H.-P., Roth, D. and Zimak, D. (2002) Constraint Classification: A New Approach to Multiclass Classification. International Conference on Algorithmic Learning Theory. Springer, Berlin Heidelberg.
[10]  Ethem, A. (2010) Introduction to Machine Learning. The MIT Press, Cambridge, 249-256.
[11]  Friedman, J.H. (2001) Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, 29, 1189-1232.
https://doi.org/10.1214/aos/1013203451
[12]  Ron, K. and Provost, F. (1998) Glossary of Terms. Machine Learning, 30, 271-274.
https://doi.org/10.1023/A:1007411609915
[13]  Tom, F. (2006) An Introduction to ROC Analysis. Pattern Recognition Letters, 27, 861-874.
https://doi.org/10.1016/j.patrec.2005.10.010
[14]  Wikipedia (2016) Receiver Operating Characteristic.
https://en.wikipedia.org/wiki/Receiver_operating_characteristic
[15]  William, C., Ravikumar, P. and Fienberg, S. (2003) A Comparison of String Metrics for Matching Names and Records. Kdd Workshop on Data Cleaning and Object Consolidation, 3.
[16]  Jesse, D. and Goadrich, M. (2006) The Relationship between Precision-Recall and ROC Curves. Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, 25-29 June 2006, 233-240.
https://doi.org/10.1145/1143844.1143874
[17]  Breiman, L. (1996) Bagging Predictors. Machine Learning, 24, 123-140.
https://doi.org/10.1007/BF00058655
[18]  Freund, Y., et al. (1997) Using and Combining Predictors That Specialize. Proceedings of the 29th Annual ACM Symposium on Theory of Computing, El Paso, 4-6 May 1997, 334-343.
https://doi.org/10.1145/258533.258616
[19]  Friedman, N., Geiger, D. and Goldszmidt, M. (1997) Bayesian Network Classifiers. Machine Learning, 29, 131-163.
[20]  Chen, T. and Guestrin, C. (2016) Xgboost: A Scalable Tree Boosting System.
[21]  Bergstra, J., et al. (2006) Aggregate Features and AdaBoost for Music Classification. Machine Learning, 65, 473-484.
[22]  Wolpert, D.H. (1992) Stacked Generalization. Neural Networks, 5, 241-259.
https://doi.org/10.1016/S0893-6080(05)80023-1
[23]  Ham, J., et al. (2005) Investigation of the Random Forest Framework for Classification of Hyperspectral Data. IEEE Transactions on Geoscience and Remote Sensing, 43, 492-501.
https://doi.org/10.1109/TGRS.2004.842481
[24]  Atkinson, E.J., et al. (2012) Assessing Fracture Risk Using Gradient Boosting Machine (GBM) Models. Journal of Bone and Mineral Research, 27, 1397-1404.
https://doi.org/10.1002/jbmr.1577
[25]  Ghorbani, A.A. and Owrangh, K. (2001) Stacked Generalization in Neural Networks: Generalization on Statistically Neutral Problems. Proceedings of the International Joint Conference Neural Networks, Washington DC, 15-19 July 2001, 1715-1720.
[26]  Wang, S.-Q., Yang, J. and Chou, K.-C. (2006) Using Stacked Generalization to Predict Membrane Protein Types Based on Pseudo-Amino Acid Composition. Journal of Theoretical Biology, 242, 941-946.
https://doi.org/10.1016/j.jtbi.2006.05.006
[27]  Tsymbal, A., et al. (2008) Dynamic Integration of Classifiers for Handling Concept Drift. Information Fusion, 9, 56-68.
https://doi.org/10.1016/j.inffus.2006.11.002
[28]  Bhatnagar, V., et al. (2014) Accuracy-Diversity Based Pruning of Classifier Ensembles. Progress in Artificial Intelligence, 2, 97-111.
[29]  Giacinto, G. and Roli, F. (2001) Dynamic Classifier Selection Based on Multiple Classifier Behaviour. Pattern Recognition, 34, 1879-1881.
https://doi.org/10.1016/S0031-3203(00)00150-3
[30]  Liu, G., et al. (2015) Report for Repeated Buyer Prediction Competition by Team 9*STAR*. Proceedings of the 1st International Workshop on Social Influence Analysis Co-Located with 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, 27 July 2015.
[31]  He, B., et al. (2015) Repeat Buyers Prediction after Sales Promotion for Tmall Platform. Proceedings of the 1st International Workshop on Social Influence Analysis co-located with 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, Argentina, 27 July 2015.
[32]  Domingos, P. and Pazzani, M. (1997) On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning, 29, 103-130.
[33]  Schutt, R. and O’Neil, C. (2013) Doing Data Science: Straight Talk from the Frontline. O’Reilly Media, Inc.

Full-Text

comments powered by Disqus

Contact Us

service@oalib.com

QQ:3279437679

微信:OALib Journal