Abstract
An important precondition for building effective AI models is the collection of training data at scale. Crowdsourcing is a popular methodology to achieve this goal. Its adoption, however, introduces novel challenges in data quality control, namely dealing with under-performing and malicious annotators. One of the most popular quality assurance mechanisms, especially in paid micro-task crowdsourcing, is the use of a small set of pre-annotated tasks as a gold standard, to assess the annotators' quality in real time. In this paper, we highlight a set of vulnerabilities of this scheme: a group of colluding crowd workers can easily implement and deploy a decentralised machine learning inferential system to detect and signal which parts of the task are more likely to be gold questions, making them ineffective as a quality control tool. Moreover, we demonstrate that the most common countermeasures against this attack are ineffective in practical scenarios. The basic architecture of the inferential system is composed of a browser plug-in and an external server where the colluding workers can share information. We implement and validate the attack scheme by means of experiments on real-world data from a popular crowdsourcing platform.
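To make the threat concrete, the sketch below illustrates one plausible form of the shared-server side of such an attack; it is an assumption-laden illustration, not the system described in the paper. The intuition it encodes is that gold questions are drawn from a small pre-annotated pool and therefore recur across many workers, so a server counting task sightings reported by each colluder's browser plug-in can flag frequently seen tasks as likely gold. All names (`CollusionServer`, `report_task`) and the flagging threshold are hypothetical.

```python
# Hypothetical sketch of a collusion server that flags likely gold questions.
# Assumption: gold questions come from a small reused pool, so the same task
# is shown to many colluding workers, while ordinary tasks are seen by few.

from collections import defaultdict


class CollusionServer:
    """Aggregates task sightings reported by each worker's browser plug-in."""

    def __init__(self, flag_threshold: int = 5):
        self.flag_threshold = flag_threshold  # distinct sightings before flagging
        self.sightings = defaultdict(set)     # task fingerprint -> set of worker ids

    def report_task(self, worker_id: str, task_fingerprint: str) -> bool:
        """Record that a worker saw a task; return True if it is likely gold."""
        self.sightings[task_fingerprint].add(worker_id)
        return self.is_likely_gold(task_fingerprint)

    def is_likely_gold(self, task_fingerprint: str) -> bool:
        # A task seen by many distinct colluders is probably a reused gold question.
        return len(self.sightings[task_fingerprint]) >= self.flag_threshold


# Example: each plug-in fingerprints the task content and queries the server.
server = CollusionServer(flag_threshold=3)
for worker in ["w1", "w2", "w3"]:
    likely_gold = server.report_task(worker, "hash-of-task-42")
print(likely_gold)  # True: three distinct workers saw the same task
```

A simple frequency count like this requires no labels and no coordination beyond sharing fingerprints, which is what makes the attack cheap to deploy; the paper's actual inferential system may combine such signals with richer machine learning features.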