Abstract
In this work we present a new dataset of literary events—events that are depicted as taking place within the imagined space of a novel.
While previous work has focused on event detection in the domain of contemporary news,
literature poses a number of complications
for existing systems, including complex narration, the depiction of a broad array of mental states, and a strong emphasis on figurative
language. We outline the annotation decisions
of this new dataset and compare several models for predicting events; the best-performing
model, a bidirectional LSTM with BERT token representations, achieves an F1 score of
73.9. We then apply this model to a corpus of novels split across two dimensions,
prestige and popularity, and demonstrate that
there are statistically significant differences in
the distribution of events with respect to prestige.