Abstract
In this paper, a novel Generation-Evaluation
framework is developed for multi-turn conversations with the objective of letting both participants know more about each other. For the
sake of rational knowledge utilization and coherent conversation flow, a dialogue strategy
that controls knowledge selection is instantiated and continuously adapted via reinforcement learning. Under the deployed strategy,
knowledge-grounded conversations are conducted between two dialogue agents. The generated dialogues are comprehensively evaluated
on aspects such as informativeness and coherence,
which are aligned with our objective and human instinct. These assessments are integrated
as a compound reward to guide the evolution
of the dialogue strategy via policy gradient. Comprehensive experiments have been carried out
on a publicly available dataset, demonstrating that the proposed method significantly outperforms
other state-of-the-art approaches.