资源论文LAZY-CFR: FAST AND NEAR -OPTIMAL REGRET MIN -IMIZATION FOR EXTENSIVE GAMES WITH IMPERFECT INFORMATION

LAZY-CFR: FAST AND NEAR -OPTIMAL REGRET MIN -IMIZATION FOR EXTENSIVE GAMES WITH IMPERFECT INFORMATION

2020-01-02 | |  176 |   61 |   0

Abstract

Counterfactual regret minimization (CFR) methods are effective for solving twoplayer zero-sum extensive games with imperfect information. However, the vanilla CFR has to traverse the whole game tree in each round, which is time-consuming in large-scale games. In this paper, we present Lazy-CFR, a CFR algorithm that adopts a lazy update strategy to avoid traversing the whole game tree in each round. We prove that the regret of Lazy-CFR is almost the same as the regret of the vanilla CFR and only needs to visit a small portion of the game tree. Thus, Lazy-CFR is provably faster than CFR. Empirical results consistently show that Lazy-CFR is fast in practice.

上一篇:DEEP BATCH ACTIVE LEARNING BYD IVERSE ,U NCERTAIN GRADIENT LOWER BOUNDS

下一篇:DEPTH -W IDTH TRADE -OFFS FOR RE LU NETWORKSVIA SHARKOVSKY ’S THEOREM

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...