This study presents a comparative analysis of machine learning models for threat detection in Internet of Things (IoT) devices using the CICIoT2023 dataset. We evaluate Logistic Regression, K-Nearest Neighbors, and Random Forest across three classification granularities: binary (benign vs. attack), multi-class (8 categories), and fine-grained (34 subtypes). The methodology includes preprocessing with feature engineering, variance thresholding, correlation filtering, and dimensionality reduction. We assess accuracy, precision, recall, and F1-score, as well as scalability, measured by training each model on a small sample and testing it on larger ones. Random Forest outperforms the other models on every classification task (binary: F1 = 0.710; 8-class: F1 = 0.629; 34-class: F1 = 0.590). All models degrade as classification granularity increases, and BruteForce and Web attacks are particularly hard to detect. Feature importance analysis identifies protocol-specific characteristics and TCP flag information as crucial for attack identification. In scalability testing, models trained on limited data (0.1%) lose considerable performance when applied to larger datasets (0.5%, 1%), although Random Forest generalizes best. An unsupervised autoencoder for anomaly detection reaches an accuracy of 0.881 but a recall of only 0.070. These findings highlight the trade-off between detection granularity and accuracy in IoT security implementations and suggest hierarchical classification approaches for resource-constrained environments. The study offers guidance for selecting machine learning techniques for real-world IoT security applications.
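The gap between the autoencoder's accuracy (0.881) and recall (0.070) is easier to see with a worked calculation. The sketch below computes the four reported metric types from binary confusion-matrix counts. The counts are hypothetical: they were chosen only so that accuracy and recall match the reported figures, and they are not the study's confusion matrix. Treating the anomalous class as the positive class is likewise an assumption, as are the names `ConfusionCounts` and `binary_metrics`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class assumed to be 'anomaly')."""
    tp: int  # anomalies correctly flagged
    fp: int  # normal samples wrongly flagged
    fn: int  # anomalies missed
    tn: int  # normal samples correctly passed


def binary_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1 for the positive class."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for 1000 samples, invented for illustration only.
counts = ConfusionCounts(tp=7, fp=26, fn=93, tn=874)
metrics = binary_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, most of the accuracy comes from the 874 correctly passed normal samples, while 93 of the 100 anomalies are missed. This is why accuracy alone can overstate a detector's usefulness when the positive class is a small share of the evaluated samples.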
Cite this paper
Kontagora, M. M., Adeshina, S. A., Musa, H. and Aimufua, G. I. O. (2025). An Evaluation of Machine Learning Models for Threat Classification in IoT Devices. Open Access Library Journal, 12, e3551. doi: http://dx.doi.org/10.4236/oalib.1113551.