CONAN - COunter NArratives through Nichesourcing:
a Multilingual Dataset of Responses to Fight Online Hate Speech
Abstract
Although there is an unprecedented effort to
provide adequate responses in terms of laws
and policies to hate content on social media
platforms, dealing with hatred online is still a
tough problem. Tackling hate speech through the
standard means of content deletion or user suspension may draw charges of censorship and
overblocking. One alternative strategy, which has
so far received little attention from the research
community, is to oppose hate content directly with counter-narratives (i.e. informed textual responses). In this paper, we describe the
creation of the first large-scale, multilingual,
expert-based dataset of hate speech/counter-narrative pairs. This dataset has been built
with the effort of more than 100 operators from
three different NGOs that applied their training and expertise to the task. Together with the
collected data, we also provide additional annotations about expert demographics, hate and
response type, and data augmentation through
translation and paraphrasing. Finally, we provide initial experiments to assess the quality of
our data.