Weighted Likelihood Policy Search with Model Selection

2020-01-13

Abstract

Reinforcement learning (RL) methods based on direct policy search (DPS) have been actively studied as an efficient approach to complicated Markov decision processes (MDPs). Although they have brought much progress in practical applications of RL, model selection for the policy remains an unsolved problem in DPS. In this paper, we propose a novel DPS method, weighted likelihood policy search (WLPS), in which a policy is learned efficiently through weighted likelihood estimation. WLPS naturally connects DPS to statistical inference, so that various sophisticated techniques from statistics can be applied to DPS problems directly. Following the idea of the information criterion, we develop a new measure for model comparison in DPS based on the weighted log-likelihood.
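
The abstract does not give the WLPS update or the criterion itself, so the following is only a minimal sketch of the general idea of weighted likelihood estimation for a policy, not the authors' algorithm. It assumes a linear-Gaussian policy, weights of the form exp(reward), and an AIC-like penalty; all function names, the toy data, and the penalty are hypothetical choices made for illustration.

```python
import numpy as np


def fit_weighted_gaussian_policy(features, actions, weights):
    """Weighted maximum likelihood for a ~ N(theta^T phi(s), sigma^2).

    Assumption: a linear-Gaussian policy; the paper may use another model.
    """
    w = weights / weights.sum()  # normalize the sample weights
    # Weighted least squares: solve (Phi^T W Phi) theta = Phi^T W a.
    gram = features.T @ (w[:, None] * features)
    rhs = features.T @ (w * actions)
    theta = np.linalg.solve(gram, rhs)
    resid = actions - features @ theta
    sigma2 = float(w @ resid**2)  # weighted ML estimate of the variance
    return theta, sigma2


def weighted_log_likelihood(features, actions, weights, theta, sigma2):
    """Normalized weighted Gaussian log-likelihood of the observed actions."""
    w = weights / weights.sum()
    resid = actions - features @ theta
    log_density = -0.5 * (np.log(2.0 * np.pi * sigma2) + resid**2 / sigma2)
    return float(w @ log_density)


def aic_style_score(wll, n_params, n_samples):
    """Illustrative AIC-like score (lower is better).

    Assumption: this is NOT the criterion derived in the paper.
    """
    return -2.0 * n_samples * wll + 2.0 * n_params


# Toy data (hypothetical): actions are rewarded for being close to 2 * state.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=200)
actions = rng.normal(0.0, 1.0, size=200)
rewards = -(actions - 2.0 * states) ** 2
weights = np.exp(rewards)  # assumed reward-to-weight transformation

# Compare two candidate policy models: polynomial features of degree 1 and 3.
scores = {}
for degree in (1, 3):
    phi = np.vander(states, degree + 1, increasing=True)
    theta, sigma2 = fit_weighted_gaussian_policy(phi, actions, weights)
    wll = weighted_log_likelihood(phi, actions, weights, theta, sigma2)
    # Parameter count: degree + 1 coefficients plus one variance.
    scores[degree] = aic_style_score(wll, degree + 2, len(states))
```

Under these assumptions, the candidate with the lower entry in `scores` would be preferred. The point of the sketch is only that once policy learning is cast as weighted likelihood estimation, a penalized-likelihood comparison between policy models becomes straightforward to compute.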
