
Evaluation of TB Patients Characteristics Based on Predictive Data Mining Approaches

DOI: 10.4236/jtr.2017.51002, PP. 13-22

Keywords: TB Patients, Clustering, Decision Tree, Neural Network



According to the World Health Organization, TB is the leading cause of death among infectious diseases. Given the high percentage of people with tuberculosis infection and the high number of deaths among these patients, this prospective study aims to categorize patients and to find relationships between their clinical and demographic characteristics. The study was conducted on 600 patients from the Masih-e-Daneshvari tuberculosis research center during 2015-2016. K-Means clustering and decision trees are used to perform the categorization and to determine common indicators among patients. Two clusters were chosen as optimal according to the Dunn index. Common factors between the clusters are provided in detail in the findings section. The most important factors identified by the clustering are hemoglobin, age, sex, smoking, alcohol consumption and creatinine. The RBF neural network achieved 98% accuracy. The study also identifies sex, smoking, alcohol consumption, WBC and albumin as the most important factors.
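The cluster-count selection described in the abstract (run K-Means for several values of k and keep the one with the highest Dunn index) can be illustrated with a minimal sketch. This is not the authors' code or data: the two-feature synthetic "patients", the function names (`kmeans`, `dunn_index`, `choose_k`, `synthetic_patients`) and the candidate range of k are all assumptions made for illustration, and the Dunn index variant used here (smallest distance between points in different clusters divided by the largest within-cluster diameter) is one common definition that the paper may or may not have used.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as initial centers
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def dunn_index(points, labels):
    """Min distance between clusters divided by max cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    if len(groups) < 2:
        return 0.0
    max_diameter = max(math.dist(a, b) for g in groups for a in g for b in g)
    min_separation = min(
        math.dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return min_separation / max_diameter if max_diameter > 0 else float("inf")


def choose_k(points, candidates=range(2, 6)):
    """Return the candidate k with the highest Dunn index, plus all scores."""
    scores = {k: dunn_index(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


def synthetic_patients(n_per_group=30, seed=1):
    """Hypothetical standardized 2-feature records in two separated groups."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (20.0, 20.0)]
    return [
        (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in centers
        for _ in range(n_per_group)
    ]


if __name__ == "__main__":
    best_k, scores = choose_k(synthetic_patients())
    print("Dunn index per k:", scores)
    print("selected k:", best_k)
```

A higher Dunn index means compact, well-separated clusters, so the k with the largest score is kept. Real clinical data would first need scaling and encoding of categorical fields such as sex or smoking status, which this sketch leaves out.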



