Scaling provable adversarial defenses

Abstract 

Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks. In this paper, in an effort to scale these approaches to substantially larger models, we extend previous work in three main directions. First, we present a technique for extending these training procedures to much more general networks, with skip connections (such as ResNets) and general nonlinearities; the approach is fully modular, and can be implemented automatically (analogous to automatic differentiation). Second, in the specific case of ℓ∞ adversarial perturbations and networks with ReLU nonlinearities, we adopt a nonlinear random projection for training, which scales linearly in the number of hidden units (previous approaches scaled quadratically). Third, we show how to further improve robust error through cascade models. On both MNIST and CIFAR data sets, we train classifiers that improve substantially on the state of the art in provable robust adversarial error bounds: from 5.8% to 3.1% on MNIST (with ℓ∞ perturbations of ε = 0.1), and from 80% to 36.4% on CIFAR (with ℓ∞ perturbations of ε = 2/255). Code for all experiments in the paper is available at https://github.com/locuslab/convex_adversarial/.
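
The linear scaling mentioned in the second contribution comes from estimating the ℓ1 norms that appear in the ℓ∞ robustness bound with random Cauchy projections instead of forming the full dual matrices. The sketch below is a minimal NumPy illustration of that core estimator, not code from the authors' convex_adversarial repository, and the function name is illustrative: for a vector r with i.i.d. standard Cauchy entries, wᵀr is Cauchy-distributed with scale ‖w‖₁, so the median of |wᵀr| over a modest number of projections concentrates around the row-wise ℓ1 norm.

```python
import numpy as np

def estimate_row_l1(W, k=1000, seed=None):
    """Estimate the l1 norm of each row of W using k random Cauchy projections.

    For r with i.i.d. standard Cauchy entries, each entry of W @ r is
    Cauchy-distributed with scale equal to the corresponding row's l1 norm,
    so the median of |W @ r| over many draws concentrates around those norms.
    This needs only k matrix-vector products rather than materializing the
    full matrix of dual variables.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_cauchy(size=(W.shape[1], k))  # one Cauchy vector per column
    return np.median(np.abs(W @ R), axis=1)

# Sanity check: the estimate should be close to the exact row-wise l1 norm.
W = np.random.default_rng(0).normal(size=(5, 300))
print(estimate_row_l1(W, k=2000, seed=1))
print(np.abs(W).sum(axis=1))
```

In the paper's setting these random vectors are propagated through the ReLU network itself (the "nonlinear random projection"), so the bound requires only a small number of forward-like passes rather than one per hidden unit.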
