A Corpus for Modeling User and Language Effects
in Argumentation on Online Debating
Abstract
Existing argumentation datasets have succeeded in allowing researchers to develop
computational methods for analyzing the content, structure and linguistic features of argumentative text. They have been much less
successful in fostering studies of the effect
of “user” traits — characteristics and beliefs
of the participants — on the debate/argument
outcome, as this type of user information is
generally not available. This paper presents
a dataset of 78,376 debates generated over a
10-year period along with surprisingly comprehensive participant profiles. We also complete an example study using the dataset to analyze the effect of selected user traits on the
debate outcome in comparison to the linguistic
features typically employed in studies of this
kind.