资源论文Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

2020-02-14 | |  43 |   36 |   0

Abstract 

What policy should be employed in a Markov decision process with uncertain parameters? Robust optimization’s answer to this question is to use rectangular uncertainty sets, which independently reflect available knowledge about each state, and then to obtain a decision policy that maximizes the expected reward for the worst-case decision process parameters from these uncertainty sets. While this rectangularity is convenient computationally and leads to tractable solutions, it often produces policies that are too conservative in practice, and does not facilitate knowledge transfer between portions of the state space or across related decision processes. In this work, we propose non-rectangular uncertainty sets that bound marginal moments of state-action features defined over entire trajectories through a decision process. This enables generalization to different portions of the state space while retaining appropriate uncertainty of the decision process. We develop algorithms for solving the resulting robust decision problems, which reduce to finding an optimal policy for a mixture of decision processes, and demonstrate the benefits of our approach experimentally.

上一篇:Experimental Design for Cost-Aware Learning of Causal Graphs

下一篇:Efficient Loss-Based Decoding on Graphs for Extreme Classification

用户评价
全部评价

热门资源

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...