Abstract
Artificial Intelligence has seen several breakthroughs in two-player perfect-information games.
Nevertheless, Doudizhu, a three-player imperfect-information game, remains quite challenging. In this
paper, we present a Doudizhu AI trained by deep
reinforcement learning from games of self-play.
The algorithm combines an asymmetric MCTS over
nodes representing each player's information set, a
policy-value network that approximates the policy
and value at each decision node, and inference over
the unobserved hands of the other players given their policies.
Our results show that self-play significantly improves our agent's performance in this multi-agent imperfect-information game. Even starting
from a weak AI, our agent reaches human expert level after days of self-play and training.
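To make the combination of search and network concrete, the following is a minimal sketch (not the paper's implementation) of the PUCT-style child selection commonly used when a policy-value network guides MCTS; the class and parameter names (`Node`, `c_puct`) are illustrative assumptions.

```python
import math

class Node:
    """One search-tree node; here it would correspond to a decision
    point within a player's information set."""
    def __init__(self, prior):
        self.prior = prior      # P(s, a) from the policy head of the network
        self.visits = 0         # N(s, a): simulation visit count
        self.value_sum = 0.0    # sum of backed-up values from simulations
        self.children = {}      # action -> Node

    def q(self):
        # Mean action value; 0 for unvisited nodes.
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """Pick the child maximizing Q(s,a) + U(s,a), where U trades off the
    network prior against visit counts (the PUCT rule)."""
    total = sum(ch.visits for ch in node.children.values())
    def score(ch):
        u = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))

# Usage: with no visits yet, selection follows the policy prior.
root = Node(prior=1.0)
root.children = {"pass": Node(0.2), "play_3": Node(0.8)}
action, child = select_child(root)
```

In an imperfect-information setting such as Doudizhu, each simulation would first sample the opponents' hidden hands (the inference step in the abstract) before descending the tree with this rule.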