REINFORCEMENT LEARNING WITH COMPETITIVE ENSEMBLES OF INFORMATION-CONSTRAINED PRIMITIVES

2020-01-02

Abstract

Reinforcement learning agents that operate in diverse and complex environments can benefit from the structured decomposition of their behavior. Often, this is addressed in the context of hierarchical reinforcement learning, where the aim is to decompose a policy into lower-level primitives or options, and a higher-level meta-policy that triggers the appropriate behaviors for a given situation. However, the meta-policy must still produce appropriate decisions in all states. In this work, we propose a policy design that decomposes into primitives, similarly to hierarchical reinforcement learning, but without a high-level meta-policy. Instead, each primitive decides for itself whether it wishes to act in the current state. We use an information-theoretic mechanism to enable this decentralized decision: each primitive chooses how much information it needs about the current state to make a decision, and the primitive that requests the most information about the current state acts in the world. The primitives are regularized to use as little information as possible, which leads to natural competition and specialization. We experimentally demonstrate that this policy architecture improves over both flat and hierarchical policies in terms of generalization.
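
The selection rule described in the abstract can be illustrated with a small sketch. The abstract does not say how "information about the current state" is measured, so the code below assumes one common proxy: the KL divergence between a primitive's Gaussian encoding of the state and a standard normal prior. The names `Primitive`, `select_primitive`, `regularized_objective`, and the coefficient `beta` are hypothetical and chosen only for illustration; the linear encoder with fixed unit variance is a simplification, not the paper's architecture.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats (assumed information proxy)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


class Primitive:
    """Hypothetical primitive: a linear Gaussian encoder of the state."""

    def __init__(self, state_dim, latent_dim, rng):
        # Random weights stand in for learned encoder parameters.
        self.w_mu = [[rng.gauss(0.0, 0.5) for _ in range(state_dim)]
                     for _ in range(latent_dim)]
        self.log_var = [0.0] * latent_dim  # unit variance, kept fixed for simplicity

    def encode(self, state):
        mu = [sum(w * s for w, s in zip(row, state)) for row in self.w_mu]
        return mu, self.log_var


def select_primitive(primitives, state):
    """Return (winner index, per-primitive information costs, softmax weights)."""
    costs = [kl_to_standard_normal(*p.encode(state)) for p in primitives]
    # Numerically stable softmax over the costs: a soft version of the competition.
    top = max(costs)
    exps = [math.exp(c - top) for c in costs]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The primitive that requests the most information acts.
    winner = max(range(len(costs)), key=costs.__getitem__)
    return winner, costs, weights


def regularized_objective(reward, costs, beta=0.01):
    """Reward minus an information penalty; beta is an assumed trade-off coefficient."""
    return reward - beta * sum(costs)


rng = random.Random(0)
primitives = [Primitive(state_dim=3, latent_dim=2, rng=rng) for _ in range(4)]
winner, costs, weights = select_primitive(primitives, [1.0, -0.5, 2.0])
print("acting primitive:", winner)
print("information costs:", [round(c, 3) for c in costs])
```

Under these assumptions the two pressures in the abstract are both visible: the penalty in `regularized_objective` pushes every primitive toward using less information, while `select_primitive` hands control to whichever primitive still requests the most in a given state.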

