资源论文Coco-Q: Learning in Stochastic Games with Side Payments

Coco-Q: Learning in Stochastic Games with Side Payments

2020-03-02 | |  93 |   48 |   0

Abstract

Coco (“cooperative/competitive”) values are a solution concept for two-player normalform games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing how the strategies learned by the Coco-Q algorithm relate to those learned by existing multiagent Q-learning algorithms.

上一篇:Ellipsoidal Multiple Instance Learning

下一篇:Principal Component Analysis on non-Gaussian Dependent Data

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...