Abstract
Although releasing crowdsourced data helps data analyzers
conduct statistical analysis, it may violate crowd users’ data privacy. A potential way to address this problem is to
employ traditional differential privacy (DP) mechanisms and perturb the data with some noise before
releasing them. However, because crowdsourced data
usually contain conflicts and are large in volume,
directly applying these mechanisms cannot guarantee good
utility when releasing crowdsourced data.
To address this challenge, we propose PrisCrowd, a novel
privacy-aware synthesizing method for crowdsourced data.
With it, the data collector can release users’ data with
strong privacy protection for their private information, while the data analyzer can
still achieve good utility from the released data. Both
theoretical analysis and extensive experiments on
real-world datasets demonstrate the desired performance of the proposed method.