Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following
Abstract
We consider the problem of learning to map from natural language instructions to state transitions (actions) in a data-efficient manner. Our method takes inspiration from the idea that it should be easier to ground language to concepts that have already been formed through pre-linguistic observation. We augment a baseline instruction-following learner with an initial environment-learning phase that uses observations of language-free state transitions to induce a suitable latent representation of actions before processing the instruction-following training data. We show that mapping to pre-learned representations substantially improves performance over systems whose representations are learned from limited instructional data alone.
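The abstract does not specify an architecture, but the two-phase recipe it describes can be sketched in code. Below is a minimal PyTorch illustration, assuming a simple transition autoencoder for the environment-learning phase and a GRU instruction encoder for the grounding phase; all module names, dimensions, and the stand-in random data are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; chosen only for illustration.
STATE_DIM, LATENT_DIM, VOCAB_SIZE, EMBED_DIM = 32, 16, 1000, 64

class TransitionAutoencoder(nn.Module):
    """Phase 1: induce a latent action code from language-free
    (state, next_state) pairs. The low-dimensional bottleneck z must
    capture what changed, i.e. an action-like representation."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Linear(2 * STATE_DIM, 64), nn.ReLU(), nn.Linear(64, LATENT_DIM))
        self.decode = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, STATE_DIM))

    def forward(self, s, s_next):
        z = self.encode(torch.cat([s, s_next], dim=-1))        # latent action
        s_next_hat = self.decode(torch.cat([s, z], dim=-1))    # reconstruct outcome
        return z, s_next_hat

class InstructionEncoder(nn.Module):
    """Phase 2: map instruction tokens into the pre-learned latent space."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.GRU(EMBED_DIM, LATENT_DIM, batch_first=True)

    def forward(self, tokens):
        _, h = self.rnn(self.embed(tokens))
        return h.squeeze(0)  # predicted latent action, shape (batch, LATENT_DIM)

# Phase 1: environment learning on plentiful language-free transitions.
env_model = TransitionAutoencoder()
opt = torch.optim.Adam(env_model.parameters(), lr=1e-3)
s, s_next = torch.randn(256, STATE_DIM), torch.randn(256, STATE_DIM)  # stand-in data
for _ in range(100):
    z, s_next_hat = env_model(s, s_next)
    loss = nn.functional.mse_loss(s_next_hat, s_next)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: freeze the pre-learned representation and ground the limited
# instruction-paired data onto it, rather than learning actions from scratch.
for p in env_model.parameters():
    p.requires_grad_(False)
inst_model = InstructionEncoder()
opt2 = torch.optim.Adam(inst_model.parameters(), lr=1e-3)
tokens = torch.randint(0, VOCAB_SIZE, (256, 8))  # stand-in instruction tokens
with torch.no_grad():
    z_target = env_model.encode(torch.cat([s, s_next], dim=-1))
for _ in range(100):
    loss = nn.functional.mse_loss(inst_model(tokens), z_target)
    opt2.zero_grad(); loss.backward(); opt2.step()
```

The key design point the sketch reflects is the ordering: the action representation is fixed before any language is seen, so the instruction learner only has to solve a grounding problem rather than jointly inventing the action space from limited paired data.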