Abstract
Crowdsourcing services provide a fast, efficient,
and cost-effective means of obtaining large amounts of labeled
data for supervised learning. Ground truth inference, also called label integration, designs proper
aggregation strategies to infer the unknown true label of each instance from the multiple noisy label
set provided by ordinary crowd workers. However, to the best of our knowledge, nearly all existing
label integration methods focus solely on the multiple noisy label set of each individual instance
while ignoring the intercorrelation among
multiple noisy label sets of different instances. To
solve this problem, a multiple noisy label distribution propagation (MNLDP) method is proposed in
this study. MNLDP first transforms the multiple
noisy label set of each instance into its multiple
noisy label distribution and then propagates its multiple noisy label distribution to its nearest neighbors. Consequently, each instance absorbs a fraction of the multiple noisy label distributions from
its nearest neighbors while retaining a fraction of its own original multiple noisy
label distribution. Promising experimental results on simulated and real-world datasets validate the
effectiveness of our proposed method.
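
The two steps described above (turning each multiple noisy label set into a distribution, then mixing it with the distributions of nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `label_distributions` and `propagate`, the Euclidean distance, the uniform neighbor weights, and the parameters `k`, `alpha`, and `n_iter` are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def label_distributions(noisy_labels, n_classes):
    """Normalize each instance's crowd labels into a class histogram.

    noisy_labels: list of lists of integer class ids, one list per instance.
    """
    dists = np.zeros((len(noisy_labels), n_classes))
    for i, labels in enumerate(noisy_labels):
        for c in labels:
            dists[i, c] += 1.0
        dists[i] /= dists[i].sum()
    return dists


def propagate(features, dists, k=2, alpha=0.5, n_iter=1):
    """Mix each instance's distribution with those of its k nearest neighbors.

    Assumed update (hypothetical, not taken from the paper):
        new = alpha * original + (1 - alpha) * mean(neighbor distributions)
    so alpha is the fraction of its own original distribution an instance keeps.
    """
    # Pairwise Euclidean distances; an instance is never its own neighbor.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    current = dists.copy()
    for _ in range(n_iter):
        neighbor_mean = current[neighbors].mean(axis=1)
        current = alpha * dists + (1.0 - alpha) * neighbor_mean
    return current


def integrate(features, noisy_labels, n_classes, k=2, alpha=0.5, n_iter=1):
    """Infer one label per instance as the argmax of the propagated distribution."""
    dists = label_distributions(noisy_labels, n_classes)
    return propagate(features, dists, k, alpha, n_iter).argmax(axis=1)
```

Under these assumptions, an instance whose own workers mostly chose the wrong class can still be assigned the class favored by its close neighbors, which is the intuition behind using the intercorrelation among instances. The choice of `alpha` controls how strongly neighbors can override an instance's own labels.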