Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem
Abstract
Classifiers used in the wild, in particular in safety-critical systems, should not only have good generalization properties but should also know when they don't know; in particular, they should make low-confidence predictions far away from the training data. We show that ReLU-type neural networks, which yield a piecewise linear classifier function, fail in this regard, as they almost always produce high-confidence predictions far away from the training data. For bounded domains like images, we propose a new robust optimization technique, similar to adversarial training, which enforces low-confidence predictions far away from the training data. We show that this technique is surprisingly effective in reducing the confidence of predictions far away from the training data, while maintaining high-confidence predictions and test error on the original classification task compared to standard training.
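The abstract describes the technique only at a high level. As a rough illustration of what a robust optimization scheme "similar to adversarial training" could look like, the following PyTorch sketch pairs standard cross-entropy training with a PGD-style inner maximization that searches near an out-of-distribution sample for the point where the model is most confident, and penalizes that confidence. All names and hyperparameters here (max_log_conf, robust_training_step, eps, steps, alpha, lam, and the choice of out-distribution samples z) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def max_log_conf(logits):
    # Confidence of the predicted class: max_k log p_k(x).
    return F.log_softmax(logits, dim=1).max(dim=1).values

def robust_training_step(model, x, y, z, eps=0.3, steps=5, alpha=0.1, lam=1.0):
    """One training step: cross-entropy on in-distribution data (x, y) plus a
    worst-case confidence penalty around out-of-distribution samples z.
    Hypothetical sketch; not the paper's exact objective."""
    # Inner maximization (PGD-style, as in adversarial training): search in an
    # eps-ball around z for the point where the model is most confident.
    u = z.clone().detach().requires_grad_(True)
    for _ in range(steps):
        conf = max_log_conf(model(u)).sum()
        grad, = torch.autograd.grad(conf, u)
        u = torch.min(torch.max(u + alpha * grad.sign(), z - eps), z + eps)
        u = u.clamp(0.0, 1.0).detach().requires_grad_(True)  # stay in the image domain
    # Outer minimization: preserve accuracy on the original task while pushing
    # down the worst-case confidence on points far from the training data.
    loss = F.cross_entropy(model(x), y) + lam * max_log_conf(model(u)).mean()
    return loss
```

In this sketch, z would be drawn from some distribution of points far from the training data (e.g., random noise for image domains), and lam trades off task accuracy against low confidence away from the data.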