What to Expect of Classifiers?
Reasoning about Logistic Regression with Missing Features
Abstract
While discriminative classifiers often yield strong predictive performance, missing feature values at prediction time can be a challenge. Classifiers may not behave as expected under certain substitutions of the missing values, since they inherently make assumptions about the distribution of the data they were trained on. In this paper, we propose a novel framework that classifies examples with missing features by computing the expected prediction with respect to a feature distribution. Moreover, we use geometric programming to learn a naive Bayes distribution that embeds a given logistic regression classifier and whose expected predictions can be computed efficiently. Empirical evaluations show that our model achieves the same performance as the logistic regression classifier when all features are observed, and outperforms standard imputation techniques when features go missing at prediction time. Furthermore, we demonstrate that our method can be used to generate “sufficient explanations” of logistic regression classifications, by removing features that do not affect the classification.