RTFM: GENERALISING TO NOVEL ENVIRONMENT DYNAMICS VIA READING

2020-01-02


Abstract

Obtaining policies that can generalise to new environments in reinforcement learning is challenging. In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments. We propose a grounded policy learning problem, Read to Fight Monsters (RTFM), in which the agent must jointly reason over a language goal, relevant dynamics described in a document, and environment observations. We procedurally generate environment dynamics and corresponding language descriptions of the dynamics, such that agents must read to understand new environment dynamics instead of memorising any particular information. In addition, we propose txt2π, a model that captures three-way interactions between the goal, document, and observations. On RTFM, txt2π generalises to new environments with dynamics not seen during training via reading. Furthermore, our model outperforms baselines such as FiLM and language-conditioned CNNs on RTFM. Through curriculum learning, txt2π produces policies that excel on complex RTFM tasks requiring several reasoning and coreference steps.
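The idea of procedurally generating dynamics together with their language descriptions can be illustrated with a toy sketch. The code below is not the authors' environment or model: the vocabulary (`MODIFIERS`, `ELEMENTS`), the sentence template, and the functions `generate_dynamics`, `describe`, and `read_policy` are all hypothetical names invented for this illustration. It only shows why an agent facing freshly sampled dynamics has to consult the document rather than rely on a memorised mapping.

```python
import random

# Hypothetical vocabulary, chosen only for illustration (an assumption,
# not taken from the paper).
MODIFIERS = ["blessed", "shimmering", "grandmaster", "gleaming"]
ELEMENTS = ["fire", "cold", "poison", "lightning"]


def generate_dynamics(rng):
    """Sample a random one-to-one assignment: which modifier beats which element."""
    elements = ELEMENTS[:]
    rng.shuffle(elements)
    return dict(zip(MODIFIERS, elements))


def describe(dynamics, rng):
    """Render the sampled dynamics as a short document, in shuffled sentence order."""
    sentences = [f"{mod} items beat {elem} monsters." for mod, elem in dynamics.items()]
    rng.shuffle(sentences)
    return " ".join(sentences)


def read_policy(document, monster_element):
    """Choose an item modifier by reading the document instead of memorising."""
    for sentence in document.split("."):
        words = sentence.split()
        # Assumed sentence template: "<modifier> items beat <element> monsters"
        if len(words) == 5 and words[3] == monster_element:
            return words[0]
    return None  # the document says nothing about this element


rng = random.Random(0)
dynamics = generate_dynamics(rng)   # new dynamics for every episode
document = describe(dynamics, rng)  # matching language description
target = "fire"                     # element of the monster named in the goal
choice = read_policy(document, target)
assert dynamics[choice] == target   # reading recovers the sampled rule
```

Because the assignment is resampled for every episode, a fixed lookup table learned on earlier episodes would fail on new ones, while the reading-based choice stays correct. A learned model would replace the hand-written template parser in `read_policy`.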
