Nonparametric Bayesian Policy Priors for Reinforcement Learning

2020-01-08

Abstract
We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.
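To make the abstract's core idea concrete, here is a minimal illustrative sketch (not the paper's actual algorithm): a Bayesian posterior over candidate models that combines two evidence sources, exploration data (model knowledge) and expert action choices (policy knowledge), under a prior biased toward the simpler model. All numbers, model names, and probabilities below are hypothetical.

```python
import math

# Hypothetical toy task: two candidate models of a one-step domain, each
# assigning a success probability to action A.
candidate_models = {
    "simple": 0.8,   # P(success | action A) under the "simple" model
    "complex": 0.5,  # P(success | action A) under the alternative model
}

# Prior biased toward the simpler representation, mirroring the abstract's
# idea of favoring models with simple representations and simple policies.
prior = {"simple": 0.7, "complex": 0.3}

# Evidence source 1: independent exploration -- 8 successes in 10 tries of A.
successes, trials = 8, 10

# Evidence source 2: expert demonstrations -- the expert chose A in 5 of 5
# states; each model implies a (hypothetical) probability of choosing A.
expert_choice_prob = {"simple": 0.9, "complex": 0.6}
expert_picks_a = 5

def log_likelihood(name):
    """Joint log-likelihood of exploration data and expert choices."""
    p = candidate_models[name]
    ll_model = successes * math.log(p) + (trials - successes) * math.log(1 - p)
    ll_policy = expert_picks_a * math.log(expert_choice_prob[name])
    return ll_model + ll_policy

# Posterior over models, combining prior, model evidence, and policy evidence.
log_post = {m: math.log(prior[m]) + log_likelihood(m) for m in candidate_models}
shift = max(log_post.values())  # subtract max for numerical stability
unnorm = {m: math.exp(v - shift) for m, v in log_post.items()}
total = sum(unnorm.values())
posterior = {m: u / total for m, u in unnorm.items()}

print(posterior)  # the "simple" model, which fits both evidence sources, dominates
```

Because the simple model matches both the observed success rate and the expert's choices, its posterior mass dominates; in the paper's setting the same combination is carried out over nonparametric model and policy spaces rather than two fixed hypotheses.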


