Abstract
Recent studies show that state-of-the-art deep neural
networks (DNNs) are vulnerable to adversarial examples,
resulting from small-magnitude perturbations added to the
input. Given that emerging physical systems are using DNNs in safety-critical applications, adversarial examples
could mislead these systems and cause dangerous situations.
Therefore, understanding adversarial examples in the physical world is an important step towards developing resilient
learning algorithms. We propose a general attack algorithm,
Robust Physical Perturbations (RP2), to generate robust
visual adversarial perturbations under different physical
conditions. Using the real-world case of road sign classification, we show that adversarial examples generated using
RP2 achieve high targeted misclassification rates against
standard-architecture road sign classifiers in the physical
world under various environmental conditions, including
different viewpoints. Due to the current lack of a standardized testing
method, we propose a two-stage evaluation methodology for
robust physical adversarial examples consisting of lab and
field tests. Using this methodology, we evaluate the efficacy
of physical adversarial manipulations on real objects. With
a perturbation in the form of only black and white stickers,
we attack a real stop sign, causing targeted misclassification
in 100% of the images obtained in lab settings, and in 84.8%
of the captured video frames obtained on a moving vehicle
(field test) for the target classifier.