Resource Paper: Learning from Dialogue after Deployment: Feed Yourself, Chatbot!


2019-09-23
Abstract: The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction with its responses. When the conversation appears to be going well, the user's responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot's dialogue abilities further. On the PERSONACHAT chit-chat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.
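The decision logic described in the abstract can be sketched as a single per-turn routine: score the user's response with a satisfaction estimator, and either harvest it as an imitation example or request feedback. This is a minimal illustration only; the threshold value, function names, and example formats here are assumptions, not the paper's actual interfaces.

```python
# Minimal sketch of one turn of the self-feeding loop.
# SATISFACTION_THRESHOLD and all interfaces are illustrative assumptions.
SATISFACTION_THRESHOLD = 0.5  # assumed cutoff for "conversation going well"

def self_feed_turn(agent_reply, user_response, estimate_satisfaction,
                   imitation_examples, feedback_examples, ask_for_feedback):
    """Process one dialogue turn and harvest a new training example."""
    score = estimate_satisfaction(user_response)
    if score >= SATISFACTION_THRESHOLD:
        # Conversation appears to be going well: treat the user's
        # response as a new target to imitate.
        imitation_examples.append((agent_reply, user_response))
    else:
        # The agent likely made a mistake: ask for feedback and store
        # it as a feedback-prediction example.
        feedback = ask_for_feedback()
        feedback_examples.append((agent_reply, feedback))
    return score
```

In deployment, `estimate_satisfaction` would be a learned classifier over the dialogue context rather than the simple callable stubbed here.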

