Abstract
Weakly supervised object detection (WSOD) has
been widely studied, but the accuracy of state-of-the-art
methods remains far lower than that of strongly supervised
methods. One major reason for this huge gap is
the incomplete box detection problem, which arises
because most previous WSOD models are built on classification networks and therefore tend
to detect only the most discriminative parts of an object instead
of its complete bounding box. To solve this problem, we define a low-shot weakly supervised object
detection task and propose a novel low-shot box
correction network to address it. The proposed task
enables training object detectors on a large dataset
in which all images have image-level annotations, but only
a small portion (a few shots) have box annotations.
Given the low-shot box annotations, we use a novel
box correction network to transform the incomplete
boxes into complete ones. Extensive empirical evidence shows that our proposed method yields state-of-the-art detection accuracy under various settings on
the PASCAL VOC benchmark.
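
The low-shot data setting described above can be illustrated with a minimal sketch. It is an assumption about how such a split might be built, not the sampling protocol used in this work: the function name `make_low_shot_split`, the `shots_per_class` parameter, and the toy image ids are all hypothetical. Every image keeps its image-level labels, and only a few images per class are marked as also carrying box annotations.

```python
import random
from collections import defaultdict


def make_low_shot_split(image_labels, shots_per_class, seed=0):
    """Pick the images that would receive box annotations.

    image_labels: dict mapping image id -> set of class names
                  (the image-level annotations every image has).
    shots_per_class: number of images sampled per class (hypothetical knob).
    Returns the set of image ids selected for box annotation.
    """
    rng = random.Random(seed)

    # Group image ids by each class they contain, in a deterministic order.
    by_class = defaultdict(list)
    for image_id in sorted(image_labels):
        for cls in sorted(image_labels[image_id]):
            by_class[cls].append(image_id)

    # Sample up to `shots_per_class` images for every class.
    box_annotated = set()
    for cls in sorted(by_class):
        candidates = by_class[cls]
        k = min(shots_per_class, len(candidates))
        box_annotated.update(rng.sample(candidates, k))
    return box_annotated


# Toy example: five images with image-level labels only.
toy_labels = {
    "img0": {"cat"},
    "img1": {"cat", "dog"},
    "img2": {"dog"},
    "img3": {"cat"},
    "img4": {"dog"},
}
strong_ids = make_low_shot_split(toy_labels, shots_per_class=1)
weak_ids = set(toy_labels) - strong_ids  # image-level labels only
```

Because one image can contain several classes, the selected set may be smaller than `shots_per_class` times the number of classes; a real protocol would need to state how such overlaps are counted.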