As the population grows rapidly, the incidence of diseases such as cancer is also rising. Lung cancer is a leading cause of cancer-related deaths and has a low survival rate, so better, faster, and more accurate methods for early diagnosis are needed. Although previous research has presented numerous prediction schemes, the feature selection and learning processes used in those schemes have not improved diagnostic accuracy enough; incorrect classification and low prediction levels persist and lead to misdiagnosis. Predicting lung cancer from lung images at an early stage remains an open problem for researchers. This study presents a method for predicting lung cancer that combines the Grey Wolf Optimization Algorithm (GWOA) with Convolutional Neural Networks (CNN). A total of 14,740 CT scan images are used for classification. Starting from the Kaggle dataset, the proposed lung cancer prediction architecture comprises five phases: data preprocessing; hyper-parameter feature selection using GWOA; classification using CNN, Random Forest (RF), and Decision Tree (DT); cross-validation; and classifier evaluation. Noise in the data was removed by applying a bin smoothing normalization process. CNN with GWOA achieved the best results, with an average of 96% across accuracy, F1-score, precision, and recall, ahead of RF and DT with GWOA. CNN-GWOA also produced the lowest false negative rate (FNR), 0.023676. The low FNR indicates that very few lung cancer cases were incorrectly classified as negative, which translates to reliable prediction of the disease.
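To make the FNR figure concrete, the following is a minimal sketch of how a false negative rate is conventionally computed from true and predicted labels, FNR = FN / (FN + TP), which is also 1 minus recall. The function name and the toy labels are hypothetical illustrations and are not taken from the study's code or data.

```python
def false_negative_rate(y_true, y_pred, positive=1):
    """Return FN / (FN + TP) for the given positive label."""
    # False negatives: truly positive samples predicted as something else.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    # True positives: truly positive samples predicted as positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    if fn + tp == 0:
        raise ValueError("no positive samples in y_true")
    return fn / (fn + tp)


# Hypothetical toy labels (1 = cancer, 0 = no cancer); not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]

fnr = false_negative_rate(y_true, y_pred)
recall = 1.0 - fnr  # recall (sensitivity) is the complement of FNR
print(f"FNR = {fnr:.3f}, recall = {recall:.3f}")
```

In this toy case one of the five positive samples is missed, so the FNR is 0.2. In a screening setting a missed cancer is usually the costlier error, which is why the FNR is reported alongside the aggregate metrics.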
Cite this paper
Abuya, T. K., Waithera, W. C. and Kipruto, C. W. (2024). Augmented Lung Cancer Prediction: Leveraging Convolutional Neural Networks and Grey Wolf Optimization Algorithm. Open Access Library Journal, 11, e1172. doi: http://dx.doi.org/10.4236/oalib.1111172.