Abstract
We introduce a method for efficiently crowdsourcing multiclass annotations in challenging, real-world image datasets. Our method is designed to minimize the number of human annotations necessary to achieve a desired level of confidence in the class labels. It is based on combining models of worker behavior with computer vision. Our method is general: it can handle a large number of classes and worker labels drawn from a taxonomy rather than a flat list, and it can model the dependence among labels when workers can see a history of previous annotations. Our method may
be used as a drop-in replacement for the majority vote algorithms used in online crowdsourcing services that aggregate
multiple human annotations into a final consolidated label.
In experiments conducted on two real-life applications, we find that our method can reduce the number of required annotations by as much as a factor of 5.4 and can reduce the residual annotation error by up to 90% compared with majority voting. Furthermore, the online risk estimates produced by the models may be used to sort the annotated collection and minimize subsequent expert review effort.