Abstract
Data poisoning attacks aim to manipulate the model
produced by a learning algorithm by adversarially
modifying the training set. We consider differential
privacy as a defensive measure against this type of
attack. We show that private learners are resistant to
data poisoning attacks when the adversary is only
able to poison a small number of items. However,
this protection degrades as the adversary is allowed
to poison more data. We empirically evaluate this
protection by designing attack algorithms targeting
objective and output perturbation learners, two standard approaches to differentially-private machine
learning. Experiments show that our methods are
effective when the attacker is allowed to poison sufficiently many training items.