Abstract. This work shows that recent state-of-the-art face detectors
based on single-stage networks can be fooled by an adversarial attack. A successful attack on face detectors could be a serious malware
vulnerability in a deployed smart surveillance system that relies on
them. From a privacy standpoint, such an attack can also help prevent faces from being harvested and stored on a server. We show that existing adversarial
perturbation methods are ineffective for such an attack, especially when the input image contains multiple faces. This is because
the adversarial perturbation specifically generated for one face may disrupt the adversarial perturbation for another face. In this paper, we call
this problem the Instance Perturbation Interference (IPI) problem. This
IPI problem is addressed by studying the relationship between the deep
neural network receptive field and the adversarial perturbation. Beyond
single-stage face detectors, we find that the IPI problem also exists in
the first stage of Faster-RCNN, a commonly used two-stage object
detector. We therefore propose Localized Instance Perturbation (LIP),
which confines the adversarial perturbation to the Effective Receptive
Field (ERF) of a target to perform the attack. Experimental results show
that the LIP method substantially outperforms existing adversarial perturbation
generation methods, often by a factor of 2 to 10.
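
The idea of confining each face's perturbation to a local region can be illustrated with a minimal sketch. This is not the LIP implementation: the ERF is approximated here by the face box padded by a fixed margin, random noise stands in for a gradient-based perturbation, and the names localized_mask, confine_perturbation, and margin are hypothetical, introduced only for this illustration.

```python
import numpy as np


def localized_mask(height, width, box, margin):
    """Binary mask that is 1 inside `box` grown by `margin` pixels, 0 elsewhere.

    `box` is (x0, y0, x1, y1) with exclusive x1, y1. The padded rectangle is a
    crude, assumed stand-in for a face's Effective Receptive Field.
    """
    x0, y0, x1, y1 = box
    # Clip the padded rectangle to the image bounds.
    left, top = max(x0 - margin, 0), max(y0 - margin, 0)
    right, bottom = min(x1 + margin, width), min(y1 + margin, height)
    mask = np.zeros((height, width), dtype=np.float32)
    mask[top:bottom, left:right] = 1.0
    return mask


def confine_perturbation(perturbation, box, margin):
    """Zero out an (H, W, C) perturbation outside the padded box."""
    height, width = perturbation.shape[:2]
    mask = localized_mask(height, width, box, margin)
    return perturbation * mask[..., None]  # broadcast the mask over channels


# Toy example: two faces, each with its own image-wide perturbation.
rng = np.random.default_rng(0)
H, W = 64, 96
boxes = [(8, 20, 24, 40), (70, 16, 86, 36)]
raw = [rng.normal(scale=0.01, size=(H, W, 3)).astype(np.float32) for _ in boxes]

# Confine each perturbation to its own region before combining them.
local = [confine_perturbation(p, b, margin=6) for p, b in zip(raw, boxes)]
combined = sum(local)

# Element-wise product is nonzero only where both perturbations are active.
overlap = np.abs(local[0]) * np.abs(local[1])
```

In this toy setting the two padded boxes are disjoint, so the perturbation generated for one face leaves the pixels around the other face untouched, which is the property the abstract attributes to localization. Faces that are close together would produce overlapping regions, a case this sketch does not address.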