POSTERIOR SAMPLING FOR MULTI-AGENT REINFORCEMENT LEARNING: SOLVING EXTENSIVE GAMES WITH IMPERFECT INFORMATION


2020-01-02

Abstract

Posterior sampling for reinforcement learning (PSRL) is a useful framework for making decisions in an unknown environment. PSRL maintains a posterior distribution over the environment and then plans on an environment sampled from that posterior. Although PSRL works well on single-agent reinforcement learning problems, how to apply it to multi-agent reinforcement learning problems is largely unexplored. In this work, we extend PSRL to two-player zero-sum extensive games with imperfect information (TZIEG), a class of multi-agent systems. Technically, we combine PSRL with counterfactual regret minimization (CFR), a leading algorithm for TZIEG with a known environment. Our main contribution is a novel design of interaction strategies. With our interaction strategies, our algorithm provably converges to the Nash equilibrium at a rate of $O(\sqrt{\log T / T})$. Empirical results show that our algorithm works well.
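
The single-agent PSRL loop the abstract refers to (keep a posterior, sample an environment, plan on the sample, act, update) can be sketched as follows. This is only an illustration under assumptions that do not come from the paper: a small tabular MDP with known rewards, a uniform Dirichlet prior over transitions, and finite-horizon value iteration as the planner. The function names `sample_mdp`, `plan`, and `psrl` are hypothetical, and the sketch covers neither CFR nor the paper's interaction strategies.

```python
import numpy as np


def sample_mdp(counts, rng):
    """Draw one transition model P[s, a, :] from the Dirichlet posterior."""
    n_states, n_actions, _ = counts.shape
    P = np.empty_like(counts, dtype=float)
    for s in range(n_states):
        for a in range(n_actions):
            P[s, a] = rng.dirichlet(counts[s, a])
    return P


def plan(P, R, horizon):
    """Finite-horizon value iteration; returns a greedy policy[h, s]."""
    n_states = P.shape[0]
    V = np.zeros(n_states)
    policy = np.zeros((horizon, n_states), dtype=int)
    for h in reversed(range(horizon)):
        Q = R + P @ V                 # Q[s, a] = R[s, a] + sum_s' P[s, a, s'] V[s']
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy


def psrl(true_P, R, horizon, episodes, seed=0):
    """Run PSRL and return the posterior Dirichlet counts."""
    rng = np.random.default_rng(seed)
    counts = np.ones_like(true_P)     # assumption: Dirichlet(1, ..., 1) prior
    n_states = true_P.shape[0]
    for _ in range(episodes):
        P = sample_mdp(counts, rng)   # 1. sample an environment from the posterior
        policy = plan(P, R, horizon)  # 2. plan on the sampled environment
        s = 0                         # assumption: every episode starts in state 0
        for h in range(horizon):      # 3. act in the real environment
            a = policy[h, s]
            s_next = rng.choice(n_states, p=true_P[s, a])
            counts[s, a, s_next] += 1 # 4. update the posterior
            s = s_next
    return counts


# Toy two-state example: action 1 tends to lead to state 1, which pays reward 1.
true_P = np.array([[[0.9, 0.1], [0.1, 0.9]],
                   [[0.9, 0.1], [0.1, 0.9]]])
R = np.array([[0.0, 0.0], [1.0, 1.0]])
counts = psrl(true_P, R, horizon=5, episodes=20)
```

Because each episode plans against a fresh posterior sample, exploration comes from the posterior's uncertainty rather than from an explicit bonus; as the counts grow, the sampled models concentrate around the true transitions.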
