CNNs found to jump around more skillfully than RNNs:
Compositional generalization in seq2seq convolutional networks
Abstract
Lake and Baroni (2018) introduced the SCAN
dataset probing the ability of seq2seq models
to capture compositional generalizations, such
as inferring the meaning of “jump around” zero-shot from the component words. Recurrent
networks (RNNs) were found to completely
fail the most challenging generalization cases.
We test here a convolutional network (CNN)
on these tasks, reporting greatly improved performance with respect to RNNs. Despite this improvement, however, the CNN has not induced systematic rules, suggesting that the difference between compositional and non-compositional behaviour is not clear-cut.