Abstract
In this work, we present novel approaches to
exploit sentential context for neural machine
translation (NMT). Specifically, we first show
that a shallow sentential context, extracted
from the top encoder layer only, can improve
translation performance by contextualizing
the encoding representations of individual
words. Next, we introduce a deep sentential
context, which aggregates the sentential
context representations from all the internal
layers of the encoder to form a more comprehensive context representation. Experimental
results on the WMT14 English⇒German
and English⇒French benchmarks show
that our model consistently improves performance over the strong TRANSFORMER
model (Vaswani et al., 2017), demonstrating
the necessity and effectiveness of exploiting
sentential context for NMT.
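
To make the distinction between the two context types concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes mean pooling over word positions to obtain a sentential context vector and a uniform average across layers as the aggregation step; both choices, and the function names shallow_context and deep_context, are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def shallow_context(top_layer: np.ndarray) -> np.ndarray:
    """Sentential context from the top encoder layer only.

    top_layer has shape (seq_len, d_model). Mean pooling over positions
    is an assumption made for this sketch.
    """
    return top_layer.mean(axis=0)


def deep_context(layers: list) -> np.ndarray:
    """Sentential context aggregated over all encoder layers.

    layers is a list of (seq_len, d_model) arrays, one per encoder layer.
    Each layer is pooled separately, then the per-layer vectors are
    averaged uniformly (an assumed, simplified aggregation).
    """
    per_layer = np.stack([h.mean(axis=0) for h in layers])  # (num_layers, d_model)
    return per_layer.mean(axis=0)


# Toy encoder outputs: 6 layers, 5 source words, model dimension 8.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((5, 8)) for _ in range(6)]

g_shallow = shallow_context(layers[-1])  # uses the top layer only
g_deep = deep_context(layers)            # uses every layer
```

Both functions return a single vector of size d_model per sentence. In the shallow case it summarizes only the top layer, while in the deep case every internal layer contributes to the summary.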