Abstract
We introduce Intelligent Annotation Dialogs for bounding
box annotation. We train an agent to automatically choose
a sequence of actions for a human annotator to produce
a bounding box in a minimal amount of time. Specifically,
we consider two actions: box verification [34], where the
annotator verifies a box generated by an object detector, and
manual box drawing. We explore two kinds of agents, one
based on predicting the probability that a box will be positively
verified, and the other based on reinforcement learning.
We demonstrate that (1) our agents are able to learn efficient
annotation strategies in several scenarios, automatically
adapting to the image difficulty, the desired quality of the
boxes, and the detector strength; (2) in all scenarios the
resulting annotation dialogs speed up annotation compared to
manual box drawing alone and box verification alone, while
also outperforming any fixed combination of verification and
drawing in most scenarios; (3) in a realistic scenario where
the detector is iteratively re-trained, our agents evolve a
series of strategies that reflect the shifting trade-off between
verification and drawing as the detector grows stronger.
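
To illustrate the trade-off behind the probability-based agent, the following is a minimal sketch, not the method as implemented in the paper. It assumes a simplified dialog with at most one verification followed by manual drawing on rejection. The function names and the timing values (2 seconds per verification, 10 seconds per drawing) are hypothetical and chosen only for illustration.

```python
def expected_time_verify_first(p_accept, t_verify, t_draw):
    """Expected annotation time (seconds) when one detector box is shown for
    verification and the annotator draws a box manually if it is rejected.

    p_accept : assumed probability that the box is positively verified
    t_verify : assumed time for one verification
    t_draw   : assumed time for one manual drawing
    """
    return t_verify + (1.0 - p_accept) * t_draw


def choose_action(p_accept, t_verify, t_draw):
    """Return 'verify' if verifying first is expected to be faster than
    drawing immediately, otherwise 'draw'."""
    if expected_time_verify_first(p_accept, t_verify, t_draw) < t_draw:
        return "verify"
    return "draw"


if __name__ == "__main__":
    # Hypothetical timings: 2 s per verification, 10 s per drawing.
    for p in (0.1, 0.5, 0.9):
        print(p, choose_action(p, t_verify=2.0, t_draw=10.0))
```

Under these assumptions, verification pays off only when the acceptance probability exceeds t_verify / t_draw, which is why a stronger detector or a looser quality requirement shifts the preferred strategy toward verification.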