Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation

2019-09-30

Abstract

Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning due to the huge search spaces and the delayed rewards. Learning input-specific RL policies is a more efficient alternative, but so far depends on handcrafted rewards, which are difficult to design and yield poor performance. We propose RELIS, a novel RL paradigm that learns a reward function with Learning-to-Rank (L2R) algorithms at training time and uses this reward function to train an input-specific RL policy at test time. We prove that RELIS guarantees to generate near-optimal summaries with appropriate L2R and RL algorithms. Empirically, we evaluate our approach on extractive multi-document summarisation. We show that RELIS reduces the training time by two orders of magnitude compared to the state-of-the-art models while performing on par with them.
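To make the two-stage paradigm concrete, here is a minimal sketch in Python of the idea the abstract describes: learn a reward function from preference pairs at training time, then use only that learned reward to drive an input-specific extractive policy at test time. This is not the paper's implementation; the bag-of-words features, the linear pairwise-hinge reward model (standing in for the L2R step), the Bernoulli REINFORCE extractor (standing in for the test-time RL policy), and all names (`features`, `train_l2r_reward`, `test_time_rl`) and toy data are illustrative assumptions.

```python
# Minimal sketch of the RELIS paradigm (illustrative assumptions throughout):
# 1) training time: fit a reward model from summary preference pairs (L2R);
# 2) test time: train an input-specific policy guided only by that reward.
import numpy as np

rng = np.random.default_rng(0)

def features(summary_sents, vocab):
    """L2-normalised bag-of-words vector for a summary (list of sentences)."""
    v = np.zeros(len(vocab))
    for s in summary_sents:
        for w in s.lower().split():
            if w in vocab:
                v[vocab[w]] += 1.0
    return v / max(1.0, np.linalg.norm(v))

def train_l2r_reward(pairs, vocab, epochs=50, lr=0.1):
    """Pairwise L2R stand-in: learn w so that score(better) > score(worse)."""
    w = np.zeros(len(vocab))
    for _ in range(epochs):
        for better, worse in pairs:
            margin = w @ features(better, vocab) - w @ features(worse, vocab)
            if margin < 1.0:  # hinge-loss gradient step on violated pairs
                w += lr * (features(better, vocab) - features(worse, vocab))
    return w

def test_time_rl(doc_sents, w, vocab, budget=2, steps=500, lr=0.5):
    """REINFORCE over per-sentence inclusion logits, guided only by the
    learned reward w (no reference summary is used at test time)."""
    logits = np.zeros(len(doc_sents))
    baseline = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-logits))
        mask = rng.random(len(doc_sents)) < probs  # sample an extraction
        if mask.sum() == 0 or mask.sum() > budget:
            continue  # enforce the extraction budget
        summary = [s for s, m in zip(doc_sents, mask) if m]
        reward = w @ features(summary, vocab)
        baseline = 0.9 * baseline + 0.1 * reward  # variance-reduction baseline
        logits += lr * (reward - baseline) * (mask.astype(float) - probs)
    top = np.argsort(logits)[-budget:]  # final greedy extraction
    return [doc_sents[i] for i in sorted(top)]

# Toy usage with hypothetical preference pairs (better summary listed first):
vocab = {w: i for i, w in enumerate(
    "reward learning guides the policy fast reinforcement at test time".split())}
pairs = [(["reward learning guides the policy"], ["fast fast fast"])]
w = train_l2r_reward(pairs, vocab)
doc = ["reward learning guides the policy",
       "fast fast fast",
       "reinforcement learning at test time"]
print(test_time_rl(doc, w, vocab))
```

As in the paradigm described above, supervision is needed only to build the preference pairs at training time; the test-time search is guided purely by the learned reward, which is what allows a fresh, input-specific policy to be trained for each new document.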

