资源论文An Autoencoder Approach to Learning Bilingual Word Representations

An Autoencoder Approach to Learning Bilingual Word Representations

2020-01-17 | |  87 |   54 |   0

Abstract

Cross-language learning allows one to use training data from one language to build models for a different language. Many approaches to bilingual learning require that we have word-level alignment of sentences from parallel corpora. In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments. We show that by simply learning to reconstruct the bag-of-words representations of aligned sentences, within and between languages, we can in fact learn high-quality representations and do without word alignments. We empirically investigate the success of our approach on the problem of cross-language text classification, where a classifier trained on a given language (e.g., English) must learn to generalize to a different language (e.g., German). In experiments on 3 language pairs, we show that our approach achieves state-of-the-art performance, outperforming a method exploiting word alignments and a strong machine translation baseline.

上一篇:A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

下一篇:Convolutional Neural Network Architectures for Matching Natural Language Sentences

用户评价
全部评价

热门资源

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • dynamical system ...

    allows to preform manipulations of heavy or bul...

  • The Variational S...

    Unlike traditional images which do not offer in...