Completing the tuberculosis (TB) treatment course is crucial to protect patients against prolonged infectiousness, relapse, and the longer, more expensive therapy required for multidrug-resistant TB. Up to 50% of all patients do not complete their treatment course. To address this problem, the World Health Organization included TB treatment with patient supervision and support as an element of the "global plan to stop TB". The plan may require a model that predicts the outcome of DOTS therapy; such a tool could then be used to determine how intensive the services and support provided to each patient should be. As an initial step, this work applied and compared machine learning techniques to predict the outcome of TB therapy. After feature analysis, models based on six algorithms, namely decision tree (DT), artificial neural network (ANN), logistic regression (LR), radial basis function (RBF), Bayesian networks (BN), and support vector machine (SVM), were developed and validated. Training (N = 4515) and testing (N = 1935) sets were used, and the models were evaluated by prediction accuracy, F-measure, and recall. Seventeen significantly correlated features were identified (P <= 0.004; 95% CI = 0.001 - 0.007). DT (C4.5) was the best algorithm, with 74.21% prediction accuracy, compared with 62.06%, 57.88%, 57.31%, 53.74%, and 51.36% for ANN, BN, LR, RBF, and SVM, respectively. The characteristics and distribution of the data may explain why DT outperformed the other algorithms. The predicted class for each TB case might help improve the quality of care by making patient supervision and support more case-sensitive, thereby enhancing the quality of DOTS therapy.
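The evaluation criteria named above (prediction accuracy, recall, F-measure) can be illustrated with a short sketch. It assumes a one-vs-rest view in which a single outcome class is treated as "positive"; the function name `binary_metrics`, the class labels, and the toy label lists are hypothetical and are not data or code from the study. For reference, the reported split of 4515 training and 1935 testing cases corresponds to 70% and 30% of 6450 records.

```python
from collections import Counter


def binary_metrics(actual, predicted, positive):
    """Return accuracy, precision, recall and F-measure for one class.

    `positive` is the label treated as the class of interest; every
    other label counts as negative (a one-vs-rest assumption).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # Tally the four cells of the confusion matrix.
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[key] for key in ("tp", "fp", "fn", "tn"))

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy labels, not data from the study.
actual = ["completed", "completed", "defaulted",
          "defaulted", "completed", "defaulted"]
predicted = ["completed", "defaulted", "defaulted",
             "completed", "completed", "defaulted"]
metrics = binary_metrics(actual, predicted, positive="completed")
```

With more than two outcome classes, the same function can be called once per class and the results averaged; whether the study averaged per-class scores this way is not stated in the abstract.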
We present an integrated approach
based on supervised and unsupervised learning techniques to improve the
accuracy of six predictive models. The models were developed to predict the outcome of
a tuberculosis (TB) treatment course, and their accuracy needs improvement because they
are not yet sufficiently precise. The integrated supervised and unsupervised
learning method (ISULM) is proposed as a new way to improve model
accuracy. A dataset of 6450 Iranian TB patients under DOTS therapy was
used first to select the significant predictors and then to develop six predictive
models using decision tree, Bayesian network, logistic regression, multilayer
perceptron, radial basis function, and support vector machine algorithms.
The developed models were then integrated with k-means clustering analysis to obtain
a more accurate predicted outcome of the tuberculosis treatment course. The
results were evaluated by comparing prediction accuracy before and
after applying ISULM. Recall, precision, F-measure, and ROC area were also
used to assess model validity, along with the change percentage, which
shows how much each model differs before and after ISULM. ISULM improved
prediction accuracy for all six classifiers, by between 4% and 10%.
The largest and smallest improvements in prediction accuracy were shown by logistic
regression and support vector machine, respectively. Pre-learning with k-means
clustering, which relocates the objects and puts similar cases in the same group, can
improve classification accuracy when supervised
and unsupervised learning are integrated.
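
The abstract does not specify how ISULM couples the clustering step with the classifiers. One plausible reading, offered only as a minimal sketch, is to run k-means first and pass each case's cluster index to the classifier as an extra feature. Everything below is an assumption made for illustration: the function names (`kmeans`, `add_cluster_feature`, `change_percentage`), the choice of the first k points as initial centroids, the toy data, and the definition of change percentage as relative change.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means on numeric feature vectors.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic; real use would pick a better initialization.
    """
    points = [tuple(p) for p in points]
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = points[:k]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


def add_cluster_feature(points, labels):
    """Append each case's cluster index as an additional feature."""
    return [list(p) + [lab] for p, lab in zip(points, labels)]


def change_percentage(before, after):
    """Relative change of a score, in percent (assumed definition)."""
    return (after - before) / before * 100.0


# Hypothetical two-feature toy cases, not data from the study.
cases = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8),
         (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
cluster_labels, centroids = kmeans(cases, k=2)
augmented = add_cluster_feature(cases, cluster_labels)
```

The augmented rows would then be fed to any of the six classifiers in place of the original features, and `change_percentage` applied to the accuracy measured before and after. Training a separate classifier per cluster is an equally plausible alternative design that the abstract does not rule out.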