Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns

资源分类

2019-09-29 |

142 |

57 |

Abstract Many real-world problems, such as robot control and soccer game, are naturally modeled as sparse-interaction multi-agent systems. Reutilizing single-agent knowledge in multi-agent systems with sparse interactions can greatly accelerate the multi-agent learning process. Previous works rely on bisimulation metric to defifine Markov decision process (MDP) similarity for controlling knowledge transfer. However, bisimulation metric is costly to compute and is not suitable for highdimensional state space problems. In this work, we propose more scalable transfer learning methods based on a novel MDP similarity concept. We start by defifining the MDP similarity based on the N-step return (NSR) values of an MDP. Then, we propose two knowledge transfer methods based on deep neural networks called direct value function transfer and NSR-based value function transfer. We conduct experiments in image-based grid world, multi-agent particle environment (MPE) and Ms. Pac-Man game. The results indicate that the proposed methods can signifificantly accelerate multi-agent reinforcement learning and meanwhile get better asymptotic performance

上一篇：Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Deep Reinforcement Learning Framework

下一篇：Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

用户评价

全部评价

还没有评论，说两句吧！

热门资源

The Variational S...

Unlike traditional images which do not offer in...
Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...
Joint Pose and Ex...

Facial expression recognition (FER) is a challe...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com