Abstract
In many applications, one can define a large set of features to support the classification task at hand. At test time, however, these features become prohibitively expensive to evaluate, and only a small subset is used, often selected for its information-theoretic value. For threshold-based Naive Bayes classifiers, recent work has suggested selecting features that maximize the expected robustness of the classifier, that is, the expected probability that it maintains its decision after seeing more features. We propose the first algorithm to compute this expected same-decision probability for general Bayesian network classifiers, based on compiling the network into a tractable circuit representation. Moreover, we develop a search algorithm for optimal feature selection that utilizes efficient incremental circuit modifications. Experiments on Naive Bayes classifiers, as well as more general networks, show the efficacy and distinct behavior of this decision-making approach.