%0 Journal Article
%T A Hybrid Feature Selection Method Based on Rough Conditional Mutual Information and Naive Bayesian Classifier
%A Zilin Zeng
%A Hongjun Zhang
%A Rui Zhang
%A Youliang Zhang
%J ISRN Applied Mathematics
%D 2014
%R 10.1155/2014/382738
%U http://www.hindawi.com/journals/isrn.applied.mathematics/2014/382738/
%X We introduce a novel hybrid feature selection method based on rough conditional mutual information and a naive Bayesian classifier. Conditional mutual information is an important metric in feature selection, but it is hard to compute. We introduce a new measure called rough conditional mutual information, which is based on rough sets; it is shown that the new measure can substitute for Shannon's conditional mutual information. Thus, rough conditional mutual information can also be used to filter out irrelevant and redundant features. Subsequently, to reduce the number of features and improve classification accuracy, a wrapper approach based on the naive Bayesian classifier is used to search for the optimal feature subset within the space of candidate feature subsets selected by the filter model. Finally, the proposed algorithms are tested on several UCI datasets and compared with other classical feature selection methods. The results show that our approach obtains not only high classification accuracy but also the smallest number of selected features.

1. Introduction

With the increase of data dimensionality in many domains such as bioinformatics, text categorization, and image recognition, feature selection has become one of the most important data mining preprocessing methods. The aim of feature selection is to find a minimal subset of the original features that best characterizes the data. Since feature selection brings many advantages, such as avoiding overfitting, facilitating data visualization, reducing storage requirements, and reducing training times, it has attracted considerable attention in various areas [1]. In the past two decades, many techniques have been proposed to address this challenging task. Dash and Liu [2] point out that there are four basic steps in a typical feature selection method: subset generation, subset evaluation, stopping criterion, and validation. Most studies focus on the two major steps of feature selection: subset generation and subset evaluation. According to the subset evaluation function, feature selection methods can be divided into two categories: filter methods and wrapper methods [3]. Filter methods are independent of the predictor, whereas wrapper methods use the predictor's performance as the evaluation function. The merits of filter methods are high computational efficiency and generality. However, the results of filter methods are not always satisfactory. This is because the filter model separates feature selection from classifier learning and selects feature subsets that are independent of the learning algorithm. On
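
For reference, the quantity the abstract calls hard to compute is Shannon's conditional mutual information. Its standard definition for discrete random variables X, Y, and Z (supplied here for context, not quoted from the paper) is

    I(X;Y \mid Z) = \sum_{x}\sum_{y}\sum_{z} p(x,y,z)\,\log \frac{p(x,y \mid z)}{p(x \mid z)\,p(y \mid z)}

Estimating the joint and conditional probability distributions over many variables is what makes this quantity expensive in practice, which motivates the rough-set-based substitute proposed in the paper.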
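
The hybrid filter-then-wrapper scheme described in the abstract can be illustrated with a short sketch. The following is a minimal illustration under stated assumptions, not the authors' implementation: scikit-learn's mutual_info_classif stands in for rough conditional mutual information in the filter stage, the function name hybrid_select and the candidate-subset size n_candidates are hypothetical, and the wrapper stage greedily grows the subset using naive Bayes cross-validation accuracy.

# Minimal sketch of a hybrid filter-then-wrapper feature selection pipeline.
# Assumptions: mutual_info_classif is a stand-in for rough conditional mutual
# information; hybrid_select and n_candidates are illustrative, not from the paper.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.feature_selection import mutual_info_classif

def hybrid_select(X, y, n_candidates=10, cv=5):
    # Filter stage: rank features by mutual information with the class label
    # and keep the top n_candidates as the candidate subset.
    mi = mutual_info_classif(X, y, random_state=0)
    candidates = list(np.argsort(mi)[::-1][:n_candidates])

    # Wrapper stage: greedy forward search over the candidate subset,
    # at each pass adding the feature that most improves the
    # cross-validated accuracy of a naive Bayesian classifier.
    selected, best_score = [], 0.0
    improved = True
    while improved and candidates:
        improved = False
        for f in list(candidates):
            trial = selected + [f]
            score = cross_val_score(GaussianNB(), X[:, trial], y, cv=cv).mean()
            if score > best_score:
                best_score, best_f, improved = score, f, True
        if improved:
            selected.append(best_f)
            candidates.remove(best_f)
    return selected, best_score

On a UCI-style dataset loaded as NumPy arrays, selected, score = hybrid_select(X, y) returns the chosen feature indices and the corresponding cross-validated naive Bayes accuracy; the candidate-subset size and the number of folds are tunable assumptions of this sketch.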