JW300: A Wide-Coverage Parallel Corpus for Low-Resource Languages

2019-09-20
Abstract: Viable cross-lingual transfer critically depends on the availability of parallel texts. A shortage of such resources imposes a development and evaluation bottleneck in multilingual processing. We introduce JW300, a parallel corpus of over 300 languages with around 100 thousand parallel sentences per language pair on average. In this paper, we present the resource and showcase its utility in experiments with cross-lingual word embedding induction and multi-source part-of-speech projection.
