Abstract
We describe a multi-task learning approach
to train a Neural Machine Translation (NMT)
model with a Relevance-based Auxiliary Task
(RAT) for search query translation. The translation process for the Cross-lingual Information
Retrieval (CLIR) task is usually treated as a
black box and performed as an independent step. However, an NMT model trained
on sentence-level parallel data is not aware of
the vocabulary distribution of the retrieval corpus. We address this problem with a multi-task learning architecture that achieves a 16%
improvement over a strong NMT baseline on
an Italian-English query-document dataset. We
show, through both quantitative and qualitative
analysis, that our model generates balanced
and precise translations thanks to the regularization effect it gains from the multi-task learning
paradigm.