Abstract
This paper addressed the problem of formally verifying desirable properties of neural networks, i.e.,
obtaining provable guarantees that neural networks
satisfy specifications relating their inputs and outputs (e.g., robustness to bounded norm adversarial
perturbations). Most previous work on this topic
was limited in its applicability by the size of the network, network architecture and the complexity of
properties to be verified. In contrast, our framework
applies to a general class of activation functions and
specifications. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian
relaxation of the optimization problem to obtain an
upper bound on the worst case violation of the specification being verified. Our approach is anytime,
i.e., it can be stopped at any time and a valid bound
on the maximum violation can be obtained. Finally,
we highlight how this approach can be used to train
models that are amenable to verification