RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning


Abstract

We describe how to use robust Markov decision processes for value function approximation with state aggregation. The robustness serves to reduce the sensitivity to the approximation error of sub-optimal policies in comparison to classical methods such as fitted value iteration. This results in reducing the bounds on the γ-discounted infinite horizon performance loss by a factor of 1/(1−γ) while preserving polynomial-time computational complexity. Our experimental results show that using the robust representation can significantly improve the solution quality with minimal additional computational cost.
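The abstract describes value function approximation through state aggregation, where robustness means planning against the worst-case resolution of the approximation error inside each aggregate state. The sketch below is a minimal illustration of that idea, not the paper's implementation: it assumes a tabular MDP given by a transition tensor P, a reward matrix r, an aggregation map agg, and an uncertainty set that lets nature pick any member state of an aggregate; the function name robust_aggregated_vi and its interface are hypothetical.

```python
import numpy as np

# Minimal sketch of robust value iteration on an aggregated MDP (illustrative,
# not the paper's implementation). Assumptions:
#   P   -- array of shape (n_actions, n_states, n_states), P[a, s, s'] = Pr(s' | s, a)
#   r   -- array of shape (n_states, n_actions), immediate rewards
#   agg -- integer array of length n_states mapping each ground state to its aggregate
# The uncertainty set lets nature choose any member state of an aggregate, so the
# robust backup takes the worst case (min) over the members of each aggregate.

def robust_aggregated_vi(P, r, agg, gamma=0.95, iters=1000, tol=1e-8):
    n_states, n_actions = r.shape
    n_agg = int(agg.max()) + 1
    members = [np.flatnonzero(agg == k) for k in range(n_agg)]

    v = np.zeros(n_agg)  # value function defined on aggregate states
    for _ in range(iters):
        v_new = np.empty(n_agg)
        for k in range(n_agg):
            q = np.empty(n_actions)
            for a in range(n_actions):
                # One-step backup for every ground state in aggregate k,
                # bootstrapping from the aggregated values v[agg[s']].
                backups = r[members[k], a] + gamma * P[a][members[k]] @ v[agg]
                q[a] = backups.min()   # robust: worst case over member states
            v_new[k] = q.max()         # greedy over actions
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v
```

Replacing the inner min with a fixed weighting over member states recovers a standard (non-robust) aggregated backup in the spirit of fitted value iteration, which is the comparison point mentioned in the abstract.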
