Abstract
As social media becomes an increasingly popular platform on which news and real-time events are reported, developing automated question answering systems is critical to the effectiveness of many applications that rely on real-time knowledge. While previous datasets have
concentrated on question answering (QA) for
formal text like news and Wikipedia, we
present the first large-scale dataset for QA over
social media data. To ensure that the tweets
we collect are useful, we gather only tweets
used by journalists to write news articles. We
then ask human annotators to write questions
and answers about these tweets. Unlike other QA datasets such as SQuAD, in which the answers
are extractive, we allow the answers to be abstractive. We show that two recently proposed neural models that perform well on formal texts show limited performance when applied to our dataset. In addition, even a fine-tuned BERT model still lags behind human performance by a large margin. Our results thus point to the need for improved QA systems targeting social media text.