Discriminatively Improving Naive Bayes by Evolutionary Feature Selection
Liangxiao JIANG, Harry ZHANG, Zhihua CAI

Abstract
Improving naive Bayes (NB) for classification, often measured by classification accuracy (ACC), has received significant attention. In many real-world data mining applications, however, learning a classifier with accurate ranking, often measured by the area under the ROC curve (AUC), or accurate probability estimation, often measured by conditional log likelihood (CLL), is also desirable. Intuitively, existing improved algorithms that aim at accurate classification tend to perform poorly when the goal is to improve naive Bayes for ranking or probability estimation, owing to a mismatch between the learning process and the learning goal. To address this problem, we need a discriminative learning approach that matches the learning process to the learning goal. In this paper, we present a discriminatively improved algorithm called evolutionary naive Bayes (ENB) and design three versions of it to achieve the three different learning goals. We name them ENB-ACC, ENB-AUC, and ENB-CLL, corresponding to classification, ranking, and probability estimation, respectively. Simply speaking, ENB selects attribute subsets by carrying out a discriminative evolutionary search through the whole space of attributes, guided by the target performance measure. We conduct an extensive empirical comparison of naive Bayes and the three versions of ENB on all 36 UCI data sets selected by Weka. The experimental results show that our improvements are effective for each of the three learning goals.
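To make the idea concrete, the following is a minimal sketch of evolutionary feature selection wrapped around naive Bayes; it is our illustration, not the authors' implementation. It assumes scikit-learn's GaussianNB and cross_val_score, and selects the learning goal through the scoring string: "accuracy" for ENB-ACC, "roc_auc" for ENB-AUC (binary classes), and "neg_log_loss" as a stand-in for CLL in ENB-CLL. The genetic operators (binary tournament selection, uniform crossover, bit-flip mutation) and all parameter values are illustrative assumptions.

```python
# Sketch of evolutionary feature selection for naive Bayes (assumed design,
# not the paper's exact algorithm). A candidate solution is a boolean mask
# over the attributes; its fitness is the cross-validated score of naive
# Bayes trained on the selected attribute subset.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y, scoring):
    """Cross-validated score of naive Bayes on the selected attribute subset."""
    if not mask.any():                      # empty subset: worst possible fitness
        return -np.inf
    scores = cross_val_score(GaussianNB(), X[:, mask], y, cv=5, scoring=scoring)
    return scores.mean()

def evolve(X, y, scoring="accuracy", pop_size=30, generations=50, p_mut=0.05):
    """Evolutionary search over attribute subsets; scoring sets the learning goal."""
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5   # random initial attribute subsets
    fit = np.array([fitness(m, X, y, scoring) for m in pop])
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # binary tournament selection of two parents
            a, b = rng.integers(pop_size, size=2)
            p1 = pop[a] if fit[a] >= fit[b] else pop[b]
            a, b = rng.integers(pop_size, size=2)
            p2 = pop[a] if fit[a] >= fit[b] else pop[b]
            child = np.where(rng.random(n) < 0.5, p1, p2)   # uniform crossover
            child ^= rng.random(n) < p_mut                  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
        fit = np.array([fitness(m, X, y, scoring) for m in pop])
    return pop[fit.argmax()]                # best attribute subset found
```

For the ranking goal, for example, one would call evolve(X, y, scoring="roc_auc") and then train the final naive Bayes model on the returned attribute subset; swapping the scoring string is the only change needed to target classification or probability estimation instead.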