Abstract
The conventional paradigm in neural question
answering (QA) for narrative content is limited
to a two-stage process: first, relevant text passages are retrieved and, subsequently, a neural
network for machine comprehension extracts
the likeliest answer. However, both stages are
largely isolated in the status quo and, hence,
information from the two phases is never properly fused. In contrast, this work proposes
RankQA¹: RankQA extends the conventional
two-stage process in neural QA with a third
stage that performs an additional answer re-ranking. The re-ranking leverages different
features that are directly extracted from the
QA pipeline, i.e., a combination of retrieval
and comprehension features. While our intentionally simple design allows for an efficient,
data-sparse estimation, it nevertheless outperforms more complex QA systems by a significant margin: in fact, RankQA achieves state-of-the-art performance on 3 out of 4 benchmark datasets. Furthermore, its performance
is especially strong in settings where the size
of the corpus is dynamic. Here, the answer re-ranking provides an effective remedy against
the underlying noise-information trade-off due
to a variable corpus size. As a consequence,
RankQA represents a novel, powerful, and
thus challenging baseline for future research
in content-based QA.