Abstract
Adaptive schemes, where tasks are assigned based on the data collected thus far, are widely used in practical crowdsourcing systems to efficiently allocate the budget. However, existing theoretical analyses of crowdsourcing systems suggest that the gain of adaptive task assignments is minimal. To bridge this gap, we investigate this question under a strictly more general probabilistic model, which has been recently introduced to model practical crowdsourcing datasets. Under this generalized Dawid-Skene model, we characterize the fundamental trade-off between budget and accuracy. We introduce a novel adaptive scheme that matches this fundamental limit. A given budget is allocated over multiple rounds. In each round, a subset of tasks with high enough confidence are classified, and increasing budget is allocated on remaining ones that are potentially more difficult. On each round, decisions are made based on the leading eigenvector of (weighted) non-backtracking operator corresponding to the bipartite assignment graph. We further quantify the gain of adaptivity, by comparing the tradeoff with the one for non-adaptive schemes, and confirm that the gain is significant and can be made arbitrarily large depending on the distribution of the difficulty level of the tasks at hand.