Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs

2020-02-20

Abstract

Robust MDPs (RMDPs) can be used to compute policies with provable worst-case guarantees in reinforcement learning. The quality and robustness of an RMDP solution are determined by the ambiguity set (the set of plausible transition probabilities), which is usually constructed as a multi-dimensional confidence region. Existing methods construct ambiguity sets as confidence regions using concentration inequalities, which leads to overly conservative solutions. This paper proposes a new paradigm that can achieve better solutions with the same robustness guarantees without using confidence regions as ambiguity sets. To incorporate prior knowledge, our algorithms optimize the size and position of ambiguity sets using Bayesian inference. Our theoretical analysis shows the safety of the proposed method, and the empirical results demonstrate its practical promise.
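
To make the baseline concrete, the following is a minimal sketch of the confidence-region construction that the abstract describes as overly conservative. It is not the paper's Bayesian method. It assumes an L1-ball ambiguity set around an empirical transition distribution, with the radius taken from the Weissman et al. concentration bound for a single state-action pair (no union bound over states and actions). The function names `weissman_l1_radius` and `worst_case_value` are hypothetical and chosen only for illustration.

```python
import math


def weissman_l1_radius(n_samples, n_states, delta):
    """L1 radius r such that ||p_hat - p||_1 <= r holds with probability
    at least 1 - delta, from the bound
    P(||p_hat - p||_1 >= r) <= (2^S - 2) * exp(-n * r^2 / 2).
    Assumes n_states >= 2 and a single state-action pair."""
    return math.sqrt(2.0 / n_samples * math.log((2 ** n_states - 2) / delta))


def worst_case_value(p_nominal, values, radius):
    """Minimize sum(q[i] * values[i]) over distributions q with
    ||q - p_nominal||_1 <= radius.

    Greedy solution: shift up to radius / 2 probability mass onto the
    lowest-value next state, taking it from the highest-value states first.
    Returns the worst-case expected value and the minimizing distribution."""
    q = list(p_nominal)
    order = sorted(range(len(values)), key=lambda i: values[i])
    i_min = order[0]
    budget = min(radius / 2.0, 1.0 - q[i_min])
    q[i_min] += budget
    for i in reversed(order):  # highest-value states first
        if budget <= 0.0:
            break
        if i == i_min:
            continue
        moved = min(budget, q[i])
        q[i] -= moved
        budget -= moved
    return sum(qi * vi for qi, vi in zip(q, values)), q


# Illustrative numbers only (not taken from the paper).
p_hat = [0.5, 0.3, 0.2]        # empirical next-state distribution
v_next = [1.0, 2.0, 3.0]       # value estimates of the next states
r = weissman_l1_radius(n_samples=100, n_states=3, delta=0.05)
robust_value, q_worst = worst_case_value(p_hat, v_next, r)
```

In this sketch the radius shrinks only at a rate of one over the square root of the sample count and grows with the number of states, so with few samples the L1 ball can cover a large part of the probability simplex. That is one way to see why ambiguity sets built from such inequalities tend to yield conservative policies.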
