
nmt-anuvada

2020-01-13


We intend to collect a parallel dataset for a Hindi-English corpus. Its primary use is to investigate translation accuracy on that corpus.

# Detail about Corpora

The IITB Hindi-English parallel corpus (approx. size 1.5M) contains data from the following domains:

GNOME                   1
KDE                     145706
Quran                   242933
Chats                   430013
Movie Dialogs           434711
General                 438933
Hi-Eng Word-Linkage     712818
Admin Dictionary        887993
Admin Examples          954457
Admin Definitions       1001292
TED Talks               1047815
Indic Multi-Parallel    1090398
Judicial I              1100747
Judicial II             1105754
Govt Websites           1109481
Wikipedia               1232841
Book Translations       1265704
Govt Website II         1492827
                        1561840
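The source does not say what the number next to each domain means. Because the values increase monotonically and end with an unlabelled 1561840, one plausible reading is that each number is the 1-based line at which that domain starts in the concatenated corpus, and that the last value is the final line. The sketch below derives per-domain sizes under that assumption only. The names `OFFSETS`, `LAST_LINE`, and `domain_sizes` are hypothetical and are not part of the project.

```python
# Assumption (not stated in the source): each number is the 1-based starting
# line of a domain in the concatenated corpus, and the final unlabelled
# number is the last line of the corpus.
OFFSETS = [
    ("GNOME", 1),
    ("KDE", 145706),
    ("Quran", 242933),
    ("Chats", 430013),
    ("Movie Dialogs", 434711),
    ("General", 438933),
    ("Hi-Eng Word-Linkage", 712818),
    ("Admin Dictionary", 887993),
    ("Admin Examples", 954457),
    ("Admin Definitions", 1001292),
    ("TED Talks", 1047815),
    ("Indic Multi-Parallel", 1090398),
    ("Judicial I", 1100747),
    ("Judicial II", 1105754),
    ("Govt Websites", 1109481),
    ("Wikipedia", 1232841),
    ("Book Translations", 1265704),
    ("Govt Website II", 1492827),
]
LAST_LINE = 1561840


def domain_sizes(offsets, last_line):
    """Return {domain: number of lines}, treating offsets as start lines."""
    sizes = {}
    for i, (name, start) in enumerate(offsets):
        # A domain ends where the next one starts; the last domain ends
        # one past the final line of the corpus.
        if i + 1 < len(offsets):
            end = offsets[i + 1][1]
        else:
            end = last_line + 1
        sizes[name] = end - start
    return sizes


sizes = domain_sizes(OFFSETS, LAST_LINE)
for name, count in sizes.items():
    print(f"{name:<22}{count:>9}")
```

Under this reading the per-domain sizes sum to the final value, which matches the stated approximate corpus size of 1.5M. If the numbers mean something else, the derived sizes do not apply.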

