Using Cases as Heuristics in Reinforcement Learning: A Transfer Learning Application

2019-11-12
Abstract: In this paper we propose combining three AI techniques to speed up a Reinforcement Learning (RL) algorithm in a Transfer Learning problem: Case-based Reasoning, Heuristically Accelerated Reinforcement Learning, and Neural Networks. To do so, we propose a new algorithm, called L3, which works in three stages: in the first stage, it uses Reinforcement Learning to learn how to perform one task and stores the optimal policy for this problem as a case-base; in the second stage, it uses a Neural Network to map actions from one domain to actions in the other domain; and in the third stage, it uses the case-base learned in the first stage as heuristics to speed up learning in a related, but different, task. The RL algorithm used in the first stage is Q-learning, and the one used in the third stage is the recently proposed Case-based Heuristically Accelerated Q-learning. A set of empirical evaluations was conducted on transferring learning between two domains, the Acrobot and the Robocup 3D: the policy learned while solving the Acrobot problem is transferred and used to speed up the learning of stability policies for a humanoid robot in the Robocup 3D simulator. The results show that using this algorithm can lead to a significant improvement in the agent's performance.
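
The idea behind the third stage, using a stored case as a heuristic that biases action selection while Q-learning continues unchanged, can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: the additive form Q(s, a) + xi * H(s, a), the way the heuristic is built from a suggested action, and all function and parameter names (`select_action`, `heuristic_from_case`, `q_update`, `xi`, `eta`) are hypothetical. Case retrieval and the neural-network action mapping are not modeled; the suggested action is simply passed in.

```python
import random


def heuristic_from_case(q_row, suggested_action, eta=1.0):
    """Build H(s, .) so the case-suggested action wins the argmax by eta.

    Assumed form: H is zero for every action except the suggested one.
    """
    best = max(q_row)
    return [
        best - q + eta if a == suggested_action else 0.0
        for a, q in enumerate(q_row)
    ]


def select_action(q_row, h_row, xi=1.0, epsilon=0.0, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a) (assumed additive form)."""
    n_actions = len(q_row)
    if rng.random() < epsilon:
        return rng.randrange(n_actions)  # explore uniformly
    scores = [q + xi * h for q, h in zip(q_row, h_row)]
    return max(range(n_actions), key=scores.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; the heuristic never enters here."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


# Usage: the case suggests action 2 even though Q currently prefers action 0.
q_row = [0.5, 0.2, 0.1]
h_row = heuristic_from_case(q_row, suggested_action=2)
chosen = select_action(q_row, h_row)
```

In this sketch the heuristic only changes which action is tried; the value estimates are still learned from observed rewards, so a poor suggestion can be overridden as the Q-values grow.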
