Learning Latent Trees with Stochastic Perturbations and Differentiable Dynamic Programming
Abstract
We treat projective dependency trees as latent variables in our probabilistic model and induce them in such a way as to be beneficial for a downstream task, without relying on any direct tree supervision. Our approach relies on Gumbel perturbations and differentiable dynamic programming. Unlike previous approaches to latent tree learning, we stochastically sample global structures and our parser is fully differentiable. We illustrate its effectiveness on sentiment analysis and natural language inference tasks. We also study its properties on a synthetic structure induction task. Ablation studies emphasize the importance of both stochasticity and constraining latent structures to be projective trees.
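To make the perturb-and-parse idea concrete, below is a minimal PyTorch sketch written under our own assumptions, not the paper's exact construction: arc scores are perturbed with Gumbel noise, and Eisner's projective parsing algorithm is relaxed by replacing each max with a temperature-controlled logsumexp, so the gradient of the relaxed value yields a soft tree through which a downstream loss can backpropagate. The names `gumbel_like` and `soft_eisner` are hypothetical.

```python
# Minimal perturb-and-parse sketch (illustrative assumptions, not the
# paper's exact relaxation): Gumbel-perturbed arc scores are fed to a
# projective Eisner inside pass in which every max is replaced by a
# temperature-controlled logsumexp, making the whole parser differentiable.
import torch


def gumbel_like(x: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Sample Gumbel(0, 1) noise with the shape of `x`."""
    u = torch.rand_like(x)
    return -torch.log(-torch.log(u + eps) + eps)


def soft_eisner(scores: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Relaxed Eisner algorithm for projective dependency parsing.

    scores[h, m] is the score of the arc h -> m; position 0 is an
    artificial root. Returns a smoothed best-tree score; differentiating
    it w.r.t. `scores` yields soft (relaxed) arc posteriors.
    """
    n = scores.size(0)
    s = scores / temperature
    lse = lambda items: torch.logsumexp(torch.stack(items), dim=0)
    zero = scores.new_zeros(())
    # Chart items: complete (C) and incomplete (I) spans,
    # head on the left (r, arcs point right) or on the right (l).
    C_r = [[zero] * n for _ in range(n)]
    C_l = [[zero] * n for _ in range(n)]
    I_r = [[zero] * n for _ in range(n)]
    I_l = [[zero] * n for _ in range(n)]
    for w in range(1, n):  # span width
        for i in range(n - w):
            j = i + w
            # Attach: build incomplete spans by adding arc i -> j or j -> i.
            split = [C_r[i][k] + C_l[k + 1][j] for k in range(i, j)]
            I_r[i][j] = lse(split) + s[i][j]
            I_l[i][j] = lse(split) + s[j][i]
            # Complete: absorb a finished subtree on the dependent side.
            C_r[i][j] = lse([I_r[i][k] + C_r[k][j] for k in range(i + 1, j + 1)])
            C_l[i][j] = lse([C_l[i][k] + I_l[k][j] for k in range(i, j)])
    return temperature * C_r[0][n - 1]


torch.manual_seed(0)
n = 6  # artificial root + 5 words
arc_scores = torch.randn(n, n, requires_grad=True)  # e.g. from an arc scorer
perturbed = arc_scores + gumbel_like(arc_scores)    # stochastic perturbation
value = soft_eisner(perturbed, temperature=1.0)
# Soft adjacency matrix of the sampled tree; create_graph=True keeps it
# differentiable so a downstream task loss can still update the scorer.
soft_tree, = torch.autograd.grad(value, arc_scores, create_graph=True)
print(soft_tree.shape, float(soft_tree.sum()))  # (6, 6); each word's head column sums to 1
```

Because the relaxed inside value is the (temperature-scaled) log-partition over projective trees, its gradient gives arc posteriors: each non-root word's column of `soft_tree` sums to one, and as the temperature goes to zero the matrix approaches the one-hot adjacency of the single best tree under the perturbed scores.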