Abstract
Sequence-processing neural networks have led to
remarkable progress on many NLP tasks. As
a consequence, there has been increasing interest in understanding to what extent they
process language as humans do. We aim
here to uncover which biases such models
display with respect to “natural” word-order
constraints. We train models to communicate about paths in a simple gridworld, using miniature languages that reflect or violate
various natural language trends, such as the
tendency to avoid redundancy or to minimize
long-distance dependencies. We study how
the controlled characteristics of our miniature
languages affect both how easily individual networks learn them
and how stable they remain across multiple network generations.
The results paint a mixed picture. On the one
hand, neural networks show a strong tendency
to avoid long-distance dependencies. On the
other hand, there is no clear preference for the
efficient, non-redundant encoding of information that is widely attested in natural language.
We thus suggest inoculating a notion of “effort” into neural networks, as a possible way
to make their linguistic behavior more humanlike.