Abstract
Discourse structure is integral to understanding a text and is helpful in many NLP tasks.
Learning latent representations of discourse
is an attractive alternative to acquiring expensive labeled discourse data. Liu and Lapata
(2018) propose a structured attention mechanism for text classification that derives a tree
over a text, akin to an RST discourse tree.
We examine this model in detail and evaluate it on additional discourse-relevant tasks and
datasets, in order to assess whether the structured attention improves performance on the
end task and whether it captures a text’s discourse structure. We find the learned latent
trees have little to no structure and instead focus on lexical cues; even after obtaining more
structured trees with proposed model modifications, the trees are still far from capturing
discourse structure when compared to discourse dependency trees from an existing discourse
parser. Finally, ablation studies show
the structured attention provides little benefit,
sometimes even hurting performance.