CoDraw: Collaborative Drawing as a Testbed for
Grounded Goal-driven Communication
Abstract
In this work, we propose a goal-driven collaborative task that combines language, perception, and action. Specifically, we develop a
Collaborative image-Drawing game between
two agents, called CoDraw. Our game is
grounded in a virtual world that contains movable clip art objects. The game involves two
players: a Teller and a Drawer. The Teller
sees an abstract scene containing multiple clip
art pieces in a semantically meaningful configuration, while the Drawer tries to reconstruct
the scene on an empty canvas using available clip art pieces. The two players communicate with each other using natural language. We collect the CoDraw dataset of
~10K dialogs consisting of ~138K messages
exchanged between human players. We define
protocols and metrics to evaluate learned
agents in this testbed, highlighting the need for
a novel crosstalk evaluation condition which
pairs agents trained independently on disjoint
subsets of the training data. We present models
for our task and benchmark them both with
fully automated evaluation and by having them
play the game live with humans.