Abstract
We present a simple new method in which an
emergent NMT system simultaneously selects training data and learns internal NMT representations. This is done in
a self-supervised way, without parallel data,
so that the two tasks enhance each
other during training. The method is language independent, introduces no additional
hyper-parameters, and achieves BLEU scores
of 29.21 (en2fr) and 27.36 (fr2en) on newstest2014 using English and French Wikipedia
data for training.