Nonparametric Bayesian Policy Priors for Reinforcement Learning

2020-01-08

Abstract
We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.
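To make the abstract's core idea concrete, here is a minimal illustrative sketch (not the paper's actual algorithm): a Bayesian posterior over candidate models that combines two evidence sources, exploration data (model knowledge) and expert action choices (policy knowledge), under a prior biased toward the simpler model. All numbers, model names, and probabilities below are hypothetical.

```python
import math

# Hypothetical toy task: two candidate models of a one-step domain, each
# assigning a success probability to action A.
candidate_models = {
    "simple": 0.8,   # P(success | action A) under the "simple" model
    "complex": 0.5,  # P(success | action A) under the alternative model
}

# Prior biased toward the simpler representation, mirroring the abstract's
# idea of favoring models with simple representations and simple policies.
prior = {"simple": 0.7, "complex": 0.3}

# Evidence source 1: independent exploration -- 8 successes in 10 tries of A.
successes, trials = 8, 10

# Evidence source 2: expert demonstrations -- the expert chose A in 5 of 5
# states; each model implies a (hypothetical) probability of choosing A.
expert_choice_prob = {"simple": 0.9, "complex": 0.6}
expert_picks_a = 5

def log_likelihood(name):
    """Joint log-likelihood of exploration data and expert choices."""
    p = candidate_models[name]
    ll_model = successes * math.log(p) + (trials - successes) * math.log(1 - p)
    ll_policy = expert_picks_a * math.log(expert_choice_prob[name])
    return ll_model + ll_policy

# Posterior over models, combining prior, model evidence, and policy evidence.
log_post = {m: math.log(prior[m]) + log_likelihood(m) for m in candidate_models}
shift = max(log_post.values())  # subtract max for numerical stability
unnorm = {m: math.exp(v - shift) for m, v in log_post.items()}
total = sum(unnorm.values())
posterior = {m: u / total for m, u in unnorm.items()}

print(posterior)  # the "simple" model, which fits both evidence sources, dominates
```

Because the simple model matches both the observed success rate and the expert's choices, its posterior mass dominates; in the paper's setting the same combination is carried out over nonparametric model and policy spaces rather than two fixed hypotheses.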


