Numeracy-600K: Learning Numeracy
for Detecting Exaggerated Information in Market Comments
Abstract
In this paper, we attempt to answer the question of whether neural network models can learn numeracy, i.e., the ability to predict the magnitude of a numeral at a specific position in a text description. A large benchmark dataset, called Numeracy-600K, is provided for this novel task. We explore several neural network models, including CNN, GRU, BiGRU, CRNN, CNN-capsule, GRU-capsule, and BiGRU-capsule, in the experiments. The results show that the BiGRU model achieves the best micro-averaged F1 score of 80.16%, and the GRU-capsule model achieves the best macro-averaged F1 score of 64.71%. Besides discussing the challenges through comprehensive experiments, we also present an important application scenario for the task, namely detecting exaggerated information in market comments.