CM3: COOPERATIVE MULTI-GOAL MULTI-STAGE MULTI-AGENT REINFORCEMENT LEARNING

2020-01-02

Abstract

A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others’ success, and credit assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The complete architecture, called CM3, learns significantly faster than direct adaptations of existing algorithms on three challenging multi-goal multi-agent problems: cooperative navigation in difficult formations, negotiating multi-vehicle lane changes in the SUMO traffic simulator, and strategic cooperation in a Checkers environment.
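The abstract does not say how function augmentation is implemented, so the following is only an illustrative sketch of one possible reading: a policy trained in the single-agent stage is extended with a new input branch for other agents' observations in the multi-agent stage. The class names, the layer sizes, and in particular the zero initialization of the new branch are assumptions made for this example, not details taken from the paper.

```python
import random

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

class Stage1Policy:
    """Hypothetical single-agent policy: sees only the agent's own observation."""
    def __init__(self, self_dim, hidden_dim, n_actions, rng):
        self.W_self = random_matrix(hidden_dim, self_dim, rng)
        self.W_out = random_matrix(n_actions, hidden_dim, rng)

    def logits(self, obs_self):
        return matvec(self.W_out, relu(matvec(self.W_self, obs_self)))

class Stage2Policy:
    """Hypothetical augmented policy: adds an input branch for other agents."""
    def __init__(self, stage1, others_dim):
        # Copy the stage-1 weights so later training does not alias them.
        self.W_self = [row[:] for row in stage1.W_self]
        self.W_out = [row[:] for row in stage1.W_out]
        hidden_dim = len(self.W_self)
        # Assumption: the new branch starts at zero, so the augmented policy
        # initially reproduces the stage-1 behaviour exactly.
        self.W_others = [[0.0] * others_dim for _ in range(hidden_dim)]

    def logits(self, obs_self, obs_others):
        own = matvec(self.W_self, obs_self)
        others = matvec(self.W_others, obs_others)
        hidden = relu([a + b for a, b in zip(own, others)])
        return matvec(self.W_out, hidden)

rng = random.Random(0)
stage1 = Stage1Policy(self_dim=4, hidden_dim=8, n_actions=3, rng=rng)
stage2 = Stage2Policy(stage1, others_dim=6)

obs_self = [0.5, -0.2, 0.1, 0.9]
obs_others = [1.0] * 6
same_at_start = stage1.logits(obs_self) == stage2.logits(obs_self, obs_others)
print(same_at_start)
```

Under these assumptions the second stage starts from the single-agent behaviour and only departs from it as the new branch is trained, which is one way a curriculum could carry learned goal attainment into the cooperative stage.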
