Path Integral Policy Improvement with Covariance Matrix Adaptation

2020-02-28

Abstract

There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI² is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI² as a member of the wider family of methods which share the concept of probability-weighted averaging to iteratively update parameters to optimize a cost function. We compare PI² to other members of the same family – Cross-Entropy Methods and CMA-ES – at the conceptual level and in terms of performance. The comparison suggests the derivation of a novel algorithm which we call PI²-CMA, for "Path Integral Policy Improvement with Covariance Matrix Adaptation". PI²-CMA's main advantage is that it determines the magnitude of the exploration noise automatically.
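The shared idea of probability-weighted averaging can be made concrete with a short sketch. The function below is an illustrative, generic version of the update common to the PI²/CEM/CMA-ES family, not the paper's exact algorithm: sampled parameter vectors are exponentially weighted by cost, and both the mean and the covariance are re-estimated from the weighted samples. The cost function, the temperature `h`, and all constants here are assumptions chosen for the demo.

```python
import numpy as np

def probability_weighted_update(samples, costs, h=10.0):
    """One parameter update via probability-weighted averaging
    (a generic sketch of the step shared by PI2, CEM, and CMA-ES).
    """
    # Map costs to weights: exponentiate the normalized negative cost,
    # so low-cost samples dominate the average (h controls greediness).
    s = (costs - costs.min()) / (costs.max() - costs.min() + 1e-12)
    w = np.exp(-h * s)
    w /= w.sum()
    new_mean = w @ samples
    diff = samples - new_mean
    # Re-estimating the covariance from the weighted samples is the
    # CMA-style step that adapts the exploration noise automatically.
    new_cov = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(samples.shape[1])
    return new_mean, new_cov

# Usage: minimize a simple quadratic cost from a poor starting guess.
rng = np.random.default_rng(0)
cost_fn = lambda x: float(x @ x)
mean, cov = np.array([5.0, -3.0]), np.eye(2)
for _ in range(30):
    samples = rng.multivariate_normal(mean, cov, size=20)
    costs = np.array([cost_fn(x) for x in samples])
    mean, cov = probability_weighted_update(samples, costs)
print(cost_fn(mean))  # lower than the initial cost of 34.0
```

Note that no gradient of the cost is ever computed: the update direction emerges purely from reweighting the sampled rollouts, which is what makes this family applicable to non-differentiable costs.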
