Abstract. Deep neural networks have recently shown remarkable performance in various applications, including vision and speech processing
tasks. However, despite performing these tasks with
high accuracy, they have been shown to be highly susceptible to adversarial attacks: a small change in the input can cause the network
to err with high confidence. This phenomenon exposes an inherent weakness
in these networks and in their ability to generalize. For this reason,
providing robustness to adversarial attacks is an important challenge in
network training, which has led to extensive research. In this work, we
propose a novel, theoretically inspired approach to improving networks’
robustness. Our method regularizes the network using the Frobenius norm
of its Jacobian, and is applied as a post-processing step after
regular training has finished. We demonstrate empirically that it leads
to enhanced robustness with a minimal change in the original
network’s accuracy.
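
To make the regularization term concrete, the following is a minimal numerical sketch and not the implementation used in this work. It assumes the Jacobian is taken of the network outputs with respect to the input, that the norm itself (rather than its square) is added to a cross-entropy loss, and that the penalty is weighted by a hypothetical coefficient `lam`. The names `toy_network`, `jacobian_fd`, and `regularized_loss` are illustrative, and the finite-difference Jacobian stands in for the automatic differentiation one would use in practice.

```python
import math


def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, with J[k][i] = d f_k / d x_i."""
    n_out = len(f(x))
    cols = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        fp, fm = f(xp), f(xm)
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    # cols is indexed [input][output]; transpose to [output][input].
    return [[cols[i][k] for i in range(len(x))] for k in range(n_out)]


def frobenius_norm(J):
    """Square root of the sum of squared entries of the matrix J."""
    return math.sqrt(sum(v * v for row in J for v in row))


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def toy_network(x):
    """Hypothetical 2-input, 3-class classifier: a linear layer followed by softmax."""
    W = [[1.0, -2.0], [0.5, 0.5], [-1.0, 1.5]]
    b = [0.0, 0.1, -0.1]
    z = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(z)


def cross_entropy(probs, label):
    return -math.log(probs[label])


def regularized_loss(f, x, label, lam=0.5):
    """Cross-entropy plus lam times the Frobenius norm of the input-output Jacobian."""
    J = jacobian_fd(f, x)
    return cross_entropy(f(x), label) + lam * frobenius_norm(J)
```

In a post-processing setting such as the one described above, a loss of this form would be minimized for a number of additional iterations starting from the already trained weights; the toy example only evaluates the loss at a single input.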