
Learning by Playing – Solving Sparse Reward Tasks from Scratch


Abstract

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors – from scratch – in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment – enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach. A video of the rich set of learned behaviours can be found at https://youtu.be/mPKyvocNeGM.
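
To make the scheduling idea concrete, below is a minimal, illustrative Python sketch of the SAC-X control loop – not the authors' implementation. It assumes a stand-in environment (`ToyEnv`) that returns one reward per task, stubs the per-task policies with random actions, and trains the scheduler with simple Monte Carlo estimates of the main-task return, roughly in the spirit of the paper's learned-scheduler variant. All names, constants, and the environment are hypothetical.

```python
import math
import random
from collections import defaultdict

# Illustrative constants (hypothetical, not from the paper).
NUM_TASKS = 3        # task 0 = main sparse task, tasks 1..2 = auxiliaries
PERIOD = 10          # environment steps per scheduling decision
EPISODE_PERIODS = 5  # scheduling decisions per episode
TEMPERATURE = 1.0    # Boltzmann temperature of the scheduler
LR = 0.1             # step size for the Monte Carlo Q update


class ToyEnv:
    """Stand-in environment: one reward per task, sparse main reward."""

    def step(self, action):
        rewards = [0.0] * NUM_TASKS
        rewards[0] = 1.0 if random.random() < 0.01 else 0.0  # sparse main
        for t in range(1, NUM_TASKS):
            rewards[t] = random.random()                     # dense aux
        return rewards


def boltzmann_choice(q_sched, history):
    """Sample the next intention proportional to exp(Q / temperature)."""
    prefs = [math.exp(q_sched[history + (t,)] / TEMPERATURE)
             for t in range(NUM_TASKS)]
    total = sum(prefs)
    return random.choices(range(NUM_TASKS),
                          weights=[p / total for p in prefs])[0]


def run_episode(env, q_sched, replay):
    history = ()       # sequence of intentions chosen so far this episode
    choices = []       # scheduler decisions, recorded as growing histories
    period_main = []   # main-task reward collected during each period
    for _ in range(EPISODE_PERIODS):
        task = boltzmann_choice(q_sched, history)
        history += (task,)
        choices.append(history)
        r_main = 0.0
        for _ in range(PERIOD):
            action = random.random()                # stub for pi_task(state)
            rewards = env.step(action)
            replay.append((task, action, rewards))  # one shared buffer:
            r_main += rewards[0]                    # every transition can
        period_main.append(r_main)                  # train every task
    # Monte Carlo scheduler update: each decision is scored by the
    # main-task return from that point onward, so auxiliaries that
    # lead to the sparse reward get scheduled more often.
    for h, key in enumerate(choices):
        ret = sum(period_main[h:])
        q_sched[key] += LR * (ret - q_sched[key])
    return sum(period_main)


if __name__ == "__main__":
    env, q_sched, replay = ToyEnv(), defaultdict(float), []
    for _ in range(20):
        ret = run_episode(env, q_sched, replay)
    print(f"last episode main-task return: {ret:.2f}")
```

The sketch preserves the two structural points stated in the abstract: every transition enters a single shared replay buffer, so all task policies can in principle be trained from it off-policy, and the scheduler is rewarded only for the main sparse task, which biases it toward executing auxiliaries whose behavior leads the agent to the sparse reward.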

