RTFM: GENERALISING TO NOVEL ENVIRONMENT DYNAMICS VIA READING

2020-01-02

Abstract

Obtaining policies that can generalise to new environments in reinforcement learning is challenging. In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments. We propose a grounded policy learning problem, Read to Fight Monsters (RTFM), in which the agent must jointly reason over a language goal, relevant dynamics described in a document, and environment observations. We procedurally generate environment dynamics and corresponding language descriptions of the dynamics, such that agents must read to understand new environment dynamics instead of memorising any particular information. In addition, we propose txt2π, a model that captures three-way interactions between the goal, document, and observations. On RTFM, txt2π generalises to new environments with dynamics not seen during training via reading. Furthermore, our model outperforms baselines such as FiLM and language-conditioned CNNs on RTFM. Through curriculum learning, txt2π produces policies that excel on complex RTFM tasks requiring several reasoning and coreference steps.
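
The procedural generation described in the abstract can be illustrated with a toy sketch. It is not the authors' generator: the vocabulary lists, the sentence templates, and the function names (`sample_dynamics`, `describe`, `make_episode`) are all hypothetical, chosen only to show how randomly sampled dynamics can be paired with a matching text description.

```python
import random

# Hypothetical toy vocabulary; illustrative only, not the paper's actual lists.
MONSTERS = ["wolf", "goblin", "bat", "jaguar"]
TEAMS = ["forest team", "rebel team"]
MODIFIERS = ["blessed", "gleaming", "rusty", "shimmering"]
ELEMENTS = ["fire", "cold", "poison", "lightning"]


def sample_dynamics(rng):
    """Sample one random environment: team membership and what beats what."""
    monsters = rng.sample(MONSTERS, len(MONSTERS))
    half = len(monsters) // 2
    team_of = {m: TEAMS[0] for m in monsters[:half]}
    team_of.update({m: TEAMS[1] for m in monsters[half:]})
    # Random one-to-one map from item modifier to the element it defeats.
    beats = dict(zip(MODIFIERS, rng.sample(ELEMENTS, len(ELEMENTS))))
    return team_of, beats


def describe(team_of, beats, rng):
    """Render the sampled dynamics as a shuffled natural-language document."""
    sentences = [f"The {m} belongs to the {t}." for m, t in team_of.items()]
    sentences += [
        f"{mod.capitalize()} items beat {el} monsters." for mod, el in beats.items()
    ]
    rng.shuffle(sentences)
    return " ".join(sentences)


def make_episode(seed):
    """Build a goal, a document, and the hidden dynamics the document describes."""
    rng = random.Random(seed)
    team_of, beats = sample_dynamics(rng)
    target_team = rng.choice(TEAMS)
    return {
        "goal": f"Defeat the {target_team}.",
        "document": describe(team_of, beats, rng),
        "team_of": team_of,
        "beats": beats,
    }


if __name__ == "__main__":
    episode = make_episode(0)
    print(episode["goal"])
    print(episode["document"])
```

Because team assignments and modifier-element pairings are resampled per episode, an agent in this toy setting cannot succeed by memorising a fixed mapping; the only reliable source of the current rules is the generated document.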
