Abstract
Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism. However, this paper finds that attention may almost completely fail to capture word alignment for some NMT models. This paper therefore proposes two methods to induce word alignment that are general and agnostic to specific NMT models. Experiments show that both methods induce much better word alignment than attention. This paper further visualizes translations through the word alignment induced by NMT. In particular, it analyzes the effect of alignment errors on translation errors at the word level, and its quantitative analysis over many test examples consistently demonstrates that alignment errors are likely to lead to translation errors under different metrics.
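As background for the attention-as-alignment baseline that the abstract critiques, the sketch below shows the common heuristic of linking each target word to the source word receiving the highest attention weight, scored with alignment error rate (AER). The attention matrix and the gold sure/possible links are hypothetical illustration data, not taken from the paper.

```python
import numpy as np

# Hypothetical attention matrix for a 3-word target and 4-word source
# sentence: rows are target positions, columns are source positions.
attn = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.2, 0.6],
])

# Induce an alignment by linking each target word j to the source
# word i with the highest attention weight (argmax heuristic).
alignment = {(int(np.argmax(attn[j])), j) for j in range(attn.shape[0])}

# Score against a hypothetical gold alignment with sure (S) and
# possible (P) links: AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|).
sure = {(0, 0), (1, 1), (2, 2)}
possible = sure | {(3, 2)}
aer = 1 - (len(alignment & sure) + len(alignment & possible)) / (len(alignment) + len(sure))
print(sorted(alignment), round(aer, 3))
```

A lower AER indicates better agreement with the gold links; the paper's finding is that for some NMT models this attention-based induction performs poorly, motivating its model-agnostic alternatives.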