Abstract
While conditional language models have
greatly improved in their ability to output
high-quality natural language, many NLP applications benefit from being able to generate
a diverse set of candidate sequences. Diverse
decoding strategies aim to, within a given-sized candidate list, cover as much of the space
of high-quality outputs as possible, leading to
improvements for tasks that re-rank and combine candidate outputs. Standard decoding
methods, such as beam search, optimize for
generating high likelihood sequences rather
than diverse ones, though recent work has focused on increasing diversity in these methods. In this work, we perform an extensive
survey of decoding-time strategies for generating diverse outputs from conditional language
models. We also show how diversity can be
improved without sacrificing quality by oversampling additional candidates, then filtering
to the desired number.
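
The oversample-then-filter idea above can be made concrete with a short sketch. The code below is a minimal illustration, not the paper's exact procedure: `sample_candidates` and `quality_score` are hypothetical stand-ins for a model's sampling routine and a candidate-scoring function, and the n-gram-coverage filter is one simple choice of diversity criterion among many.

```python
# Minimal sketch of oversampling candidates, then filtering down to k.
# `sample_candidates` and `quality_score` are hypothetical stand-ins
# for a model's sampling routine and a candidate-scoring function.

def distinct_ngrams(text, n=2):
    """Set of n-grams, used here as a crude diversity signature."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def oversample_and_filter(sample_candidates, quality_score, k,
                          oversample_factor=3):
    """Draw more candidates than needed, then keep a diverse top-k."""
    # Oversample: generate more candidates than the target list size.
    candidates = sample_candidates(k * oversample_factor)
    # Rank by quality so the filter prefers high-quality outputs.
    ranked = sorted(candidates, key=quality_score, reverse=True)
    selected, covered = [], set()
    for cand in ranked:
        ngrams = distinct_ngrams(cand)
        # Keep a candidate only if it contributes n-grams not yet
        # covered, so the final list spans more of the output space.
        if not selected or ngrams - covered:
            selected.append(cand)
            covered |= ngrams
        if len(selected) == k:
            break
    # Backfill with the remaining best candidates if the diversity
    # filter rejected too many to reach k.
    for cand in ranked:
        if len(selected) == k:
            break
        if cand not in selected:
            selected.append(cand)
    return selected
```

In this sketch, quality determines the visiting order while the n-gram check enforces diversity, so the returned list trades a small amount of likelihood for broader coverage of the candidate space.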