Policy and Value Transfer in Lifelong Reinforcement Learning


Abstract

We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution. First, we identify the initial policy that optimizes expected performance over the distribution of tasks for increasingly complex classes of policy and task distributions. We empirically demonstrate the relative performance of each policy class's optimal element in a variety of simple task distributions. We then consider value-function initialization methods that preserve PAC guarantees while simultaneously minimizing the learning required in two learning algorithms, yielding MaxQInit, a practical new method for value-function-based transfer. We show that MaxQInit performs well in simple lifelong RL experiments.
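
For concreteness, below is a minimal sketch of a MaxQInit-style initialization in the tabular setting: initialize the new task's Q-values to the per-(state, action) maximum of the Q-functions learned on previously sampled tasks, falling back to the standard optimistic bound when no prior tasks exist. The function name `max_q_init`, its signature, and the dict-based Q-function representation are illustrative assumptions, not the paper's code.

```python
def max_q_init(prior_q_functions, states, actions, v_max):
    """Sketch of MaxQInit-style value-function initialization.

    Given Q-functions learned on previously sampled tasks (each a dict
    keyed by (state, action)), initialize the next task's Q-values to the
    per-(s, a) maximum across those tasks. With enough sampled tasks this
    remains an optimistic upper bound with high probability, so PAC-style
    exploration guarantees are preserved while transfer tightens the
    initial values below the naive V_max bound.
    """
    if not prior_q_functions:
        # No prior experience: fall back to standard optimistic init.
        return {(s, a): v_max for s in states for a in actions}
    return {
        (s, a): max(q[(s, a)] for q in prior_q_functions)
        for s in states
        for a in actions
    }


# Usage on a toy 2-state, 2-action setting with two previously solved tasks.
states, actions = [0, 1], ["left", "right"]
q1 = {(s, a): 0.3 for s in states for a in actions}
q2 = {(s, a): 0.7 for s in states for a in actions}
q_init = max_q_init([q1, q2], states, actions, v_max=1.0)
# Every entry is 0.7: tighter than the naive V_max = 1.0 bound, yet still
# optimistic with high probability once enough tasks have been sampled.
```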
