Abstract
Viable cross-lingual transfer critically depends
on the availability of parallel texts. A shortage
of such resources imposes a development and
evaluation bottleneck in multilingual processing. We introduce JW300, a parallel corpus of
over 300 languages with around 100 thousand
parallel sentences per language pair on average. In this paper, we present the resource and
showcase its utility in experiments with cross-lingual word embedding induction and multi-source part-of-speech projection.