Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs

2020-02-20

Abstract

Robust MDPs (RMDPs) can be used to compute policies with provable worst-case guarantees in reinforcement learning. The quality and robustness of an RMDP solution are determined by the ambiguity set (the set of plausible transition probabilities), which is usually constructed as a multi-dimensional confidence region. Existing methods construct ambiguity sets as confidence regions using concentration inequalities, which leads to overly conservative solutions. This paper proposes a new paradigm that can achieve better solutions with the same robustness guarantees without using confidence regions as ambiguity sets. To incorporate prior knowledge, our algorithms optimize the size and position of ambiguity sets using Bayesian inference. Our theoretical analysis shows the safety of the proposed method, and the empirical results demonstrate its practical promise.
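To make the role of the ambiguity set concrete, the sketch below computes the worst-case expected value over one common choice of set: an L1 ball of radius `psi` around a nominal transition distribution `p_bar`. The abstract does not state the shape of the set, so the L1 ball, the function name `worst_case_l1`, and all parameter names are assumptions made for illustration only. This is not the paper's algorithm, only the standard inner minimization that a robust Bellman update would call for each state-action pair.

```python
import numpy as np

def worst_case_l1(p_bar, v, psi):
    """Minimize p @ v over distributions p with ||p - p_bar||_1 <= psi.

    Hypothetical helper: assumes an L1-ball ambiguity set, which the
    abstract does not specify. Returns (worst-case value, minimizing p).
    """
    p = np.array(p_bar, dtype=float)   # copy of the nominal distribution
    v = np.asarray(v, dtype=float)     # value of each next state

    # Shift as much mass as the budget allows onto the worst next state.
    i_min = int(np.argmin(v))
    eps = min(psi / 2.0, 1.0 - p[i_min])
    p[i_min] += eps

    # Take the same amount of mass away from the best next states first.
    for i in np.argsort(-v):
        if eps <= 0.0:
            break
        if i == i_min:
            continue
        take = min(eps, p[i])
        p[i] -= take
        eps -= take

    return float(p @ v), p

# Example: a larger radius psi gives a more pessimistic value estimate.
nominal = [0.5, 0.3, 0.2]
values = [1.0, 2.0, 3.0]
robust_value, worst_p = worst_case_l1(nominal, values, psi=0.4)
```

With these inputs the nominal expectation is 1.7 and the worst-case value is about 1.3, which shows why an unnecessarily large ambiguity set yields an overly conservative solution: the radius directly controls how much probability mass an adversary may move toward low-value states.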
