
Policy gradients in linearly-solvable MDPs

2020-01-06

Abstract

We present policy gradient results within the framework of linearly-solvable MDPs. For the first time, compatible function approximators and natural policy gradients are obtained by estimating the cost-to-go function, rather than the (much larger) state-action advantage function as is necessary in traditional MDPs. We also develop the first compatible function approximators and natural policy gradients for continuous-time stochastic systems.
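To make the setting concrete: in a linearly-solvable MDP the agent chooses a transition distribution u(x'|x) directly, pays a state cost q(x) plus a KL penalty KL(u(·|x) ‖ p(·|x)) against passive dynamics p, and a cost-to-go model v_θ(x) = θᵀφ(x) induces the policy u_θ(x'|x) ∝ p(x'|x) exp(−v_θ(x')). The sketch below runs a vanilla REINFORCE-style gradient on such an induced policy in a tiny discrete LMDP; it is an illustrative stand-in under these assumptions, not the paper's compatible/natural-gradient estimator, and all names (P, q, phi, theta) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite LMDP: n states, passive dynamics P (row-stochastic),
# state cost q; fixed-horizon episodes for simplicity.
n = 5
P = rng.dirichlet(np.ones(n), size=n)   # passive dynamics p(x'|x)
q = rng.uniform(0.0, 1.0, size=n)       # state cost q(x)

# Linear cost-to-go v_theta(x) = theta^T phi(x); one-hot features here,
# so theta is just a tabular cost-to-go estimate.
phi = np.eye(n)
theta = np.zeros(n)

def policy(theta):
    """Policy induced by the cost-to-go: u(x'|x) ~ p(x'|x) exp(-v_theta(x'))."""
    w = P * np.exp(-(phi @ theta))[None, :]
    return w / w.sum(axis=1, keepdims=True)

def score(x, x_next, U):
    """grad_theta log u(x'|x) = -phi(x') + E_{u(.|x)}[phi]."""
    return -phi[x_next] + U[x] @ phi

# REINFORCE on the induced policy -- a high-variance baseline sketch; the
# paper's contribution is that a compatible cost-to-go approximator and a
# natural gradient improve on exactly this kind of estimator.
alpha, horizon, episodes = 0.05, 20, 2000
for _ in range(episodes):
    U = policy(theta)
    x = int(rng.integers(n))
    traj, cost = [], 0.0
    for _ in range(horizon):
        x_next = rng.choice(n, p=U[x])
        kl = float(np.sum(U[x] * np.log(U[x] / P[x])))  # KL(u(.|x) || p(.|x))
        cost += q[x] + kl                                # one-step LMDP cost
        traj.append((x, x_next))
        x = x_next
    grad = sum(score(s, s_next, U) for s, s_next in traj) * cost
    theta -= alpha * grad / horizon   # gradient *descent*: we minimize cost
```

Note that the score function −φ(x') + E_{u_θ(·|x)}[φ] lives in the same feature space as v_θ itself, which suggests why estimating the cost-to-go in the features φ can yield a compatible approximator here, rather than requiring the much larger state-action advantage function, as the abstract claims.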

