Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I address class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms for classifying transactions as fraudulent or genuine: Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector Machine, Random Forest, XGBoost, and Light Gradient-Boosting Machine. Evaluated on AUC, PRAUC, F1, KS, Recall, and Precision, the Random Forest is the best performer in detecting fraudulent activities. Among test transactions scoring 90 and above, the Random Forest model identifies approximately 92% as fraudulent, which corresponds to a detection rate of over 70% of all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability: the SHAP summary plot highlights the global importance of individual features, such as “V12” and “V14”, while SHAP force plots offer local interpretability by revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigating financial losses and protecting consumers.
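The oversampling step mentioned above can be illustrated with a minimal sketch of the interpolation idea behind SMOTE. This is not the study's code: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy two-feature data are all assumptions made for illustration. In practice a library implementation, such as `SMOTE` from imbalanced-learn, would typically be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    Hypothetical helper for illustration only, not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority (fraud) class with two made-up features.
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(fraud, n_new=6, k=2)
print(len(new_points), "synthetic fraud samples generated")
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already occupied by the fraud class. Oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class balance.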