Abstract
Keyphrases, which concisely describe the high-level topics discussed in a document, are very
useful for a wide range of natural language
processing tasks. Though existing keyphrase
generation methods have achieved remarkable
performance on this task, they generate many
phrases that overlap with keyphrases (as their sub-phrases or
super-phrases). In this paper, we
propose a parallel Seq2Seq network with
coverage attention to alleviate the overlapping
phrase problem. Specifically, we integrate the
linguistic constraints of keyphrases into the
basic Seq2Seq network on the source side, and
employ a multi-task learning framework on
the target side. In addition, to prevent the model
from generating overlapping phrases that are syntactically correct, we introduce a coverage vector
that keeps track of the attention history and is used to decide whether parts of the source text have already been
covered by previously generated keyphrases. The
experimental results show that our method
outperforms the state-of-the-art CopyRNN on
scientific datasets and is also more effective
in the news domain.
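
To make the coverage idea concrete, the following is a minimal sketch of how a coverage vector can accumulate the attention history and discourage the decoder from attending to source positions it has already covered. It is not the formulation used in this paper, which the abstract does not specify: the function names, the toy scores, and the fixed linear `penalty` (standing in for a learned coverage term) are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(scores, coverage, penalty=1.0):
    """One decoding step of attention with a coverage vector.

    scores   -- raw attention scores, one per source position
    coverage -- sum of attention weights from all previous steps
    penalty  -- hypothetical fixed weight; real models typically learn this term
    """
    # Down-weight positions that earlier steps have already attended to.
    adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
    attention = softmax(adjusted)
    # The coverage vector is the running sum of attention distributions.
    new_coverage = [c + a for c, a in zip(coverage, attention)]
    return attention, new_coverage


def run_demo():
    """Run three decoding steps with the same toy scores at every step."""
    scores = [2.0, 1.0, 0.5, 0.1]
    coverage = [0.0] * len(scores)
    history = []
    for _ in range(3):
        attention, coverage = attend_with_coverage(scores, coverage, penalty=2.0)
        history.append(attention)
    return history, coverage
```

Because each attention distribution sums to one, the coverage vector sums to the number of decoding steps taken, and the weight on the initially dominant source position drops in the second step as its coverage grows. This is the behavior that discourages regenerating a phrase from an already covered span.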