Latent Space Policies for Hierarchical Reinforcement Learning

2020-03-16

Abstract

We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective. Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer. The maximum entropy objective causes these latent variables to be incorporated into the layer’s policy, and the higher-level layer can directly control the behavior of the lower layer through this latent space. Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity: neither the higher layers nor the lower layers are constrained in their behavior. Our experimental evaluation demonstrates that we can improve on the performance of single-layer policies on standard benchmark tasks simply by adding additional layers, and that our method can solve more complex sparse-reward tasks by learning higher-level policies on top of high-entropy skills optimized for simple low-level objectives.
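To make the architecture concrete, below is a minimal, hypothetical sketch of the core idea: each layer's policy is an invertible map from a latent variable to an action, conditioned on the state, so a higher layer can fully steer a lower layer simply by choosing its latent. The affine scale-and-shift conditioner, class names, and dimensions are illustrative assumptions for this sketch, not the paper's implementation, which builds each layer from more expressive bijective transforms.

```python
# Hypothetical sketch of invertible latent-space policy layers.
# Each layer maps a latent h to an action a, conditioned on state s,
# via an elementwise affine bijection: a = h * scale(s) + shift(s).
import numpy as np

rng = np.random.default_rng(0)

class InvertibleLayer:
    """One policy layer: a = h * scale(s) + shift(s), invertible in h."""
    def __init__(self, state_dim, action_dim):
        # Random linear maps stand in for trained conditioner networks.
        self.W_scale = 0.1 * rng.standard_normal((action_dim, state_dim))
        self.W_shift = 0.1 * rng.standard_normal((action_dim, state_dim))

    def forward(self, state, latent):
        scale = np.exp(self.W_scale @ state)  # strictly positive => bijective
        shift = self.W_shift @ state
        return latent * scale + shift

    def inverse(self, state, action):
        scale = np.exp(self.W_scale @ state)
        shift = self.W_shift @ state
        return (action - shift) / scale

def act(layers, state, top_latent):
    """Pass the top-level latent down through every layer to an action."""
    h = top_latent
    for layer in layers:  # top layer first, lowest layer last
        h = layer.forward(state, h)
    return h

state_dim, action_dim = 4, 2
layers = [InvertibleLayer(state_dim, action_dim) for _ in range(2)]
state = rng.standard_normal(state_dim)

# While the lowest layer is being trained, its latent is sampled from a
# prior; once a higher layer is stacked on top, that layer outputs the
# latent instead, controlling the lower layer's behavior.
prior_latent = rng.standard_normal(action_dim)
action = act(layers, state, prior_latent)
print("action:", action)
```

Because each layer is a bijection, composing layers loses no expressivity: for any desired action there is exactly one top-level latent that produces it, which is why higher layers retain full control.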

