Abstract
In this paper, we present a novel approach
for incorporating external knowledge into Recurrent Neural Networks (RNNs). We propose integrating lexicon features into
the self-attention mechanism of RNN-based
architectures. This form of conditioning on
the attention distribution strengthens the contribution of the words most salient to the task
at hand. We introduce three methods, namely
attentional concatenation, feature-based gating,
and affine transformation. Experiments on six
benchmark datasets show the effectiveness of
our methods. Attentional feature-based gating
yields consistent performance improvements
across tasks. Our approach is implemented as
a simple add-on module for RNN-based models, incurs minimal computational overhead, and
can be adapted to any deep neural architecture.
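As a concrete illustration, the sketch below shows one plausible instantiation of attentional feature-based gating in PyTorch, in which lexicon feature vectors gate the RNN hidden states before attention scores are computed. The class name GatedLexiconAttention, the sigmoid gating form, and all dimensions are assumptions made for illustration, not the paper's exact equations.

```python
import torch
import torch.nn as nn


class GatedLexiconAttention(nn.Module):
    # Hypothetical sketch: lexicon features c gate the RNN hidden states h
    # before additive attention scores are computed. The names and the exact
    # gating form are illustrative assumptions, not the paper's equations.
    def __init__(self, hidden_dim: int, feat_dim: int):
        super().__init__()
        self.gate = nn.Linear(feat_dim, hidden_dim)  # per-dimension gates from lexicon features
        self.score = nn.Linear(hidden_dim, 1)        # maps gated states to scalar attention energies

    def forward(self, h, c, mask=None):
        # h: (batch, seq_len, hidden_dim) RNN hidden states
        # c: (batch, seq_len, feat_dim) lexicon feature vectors aligned with h
        g = torch.sigmoid(self.gate(c))                    # gates in (0, 1) per hidden dimension
        e = self.score(torch.tanh(g * h)).squeeze(-1)      # (batch, seq_len) attention energies
        if mask is not None:
            e = e.masked_fill(~mask, float("-inf"))        # exclude padding positions
        a = torch.softmax(e, dim=-1)                       # attention distribution over tokens
        summary = torch.bmm(a.unsqueeze(1), h).squeeze(1)  # (batch, hidden_dim) weighted summary
        return summary, a
```

Used this way, the module is a drop-in replacement for a standard self-attention layer: the encoder only needs to pass the lexicon features alongside its hidden states, which is consistent with the simple add-on framing above.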