Towards Human-Machine Cooperation:
Self-supervised Sample Mining for Object Detection
Abstract
Though quite challenging, leveraging large-scale unlabeled or partially labeled images in a cost-effective way
has increasingly attracted interest due to its great importance
to computer vision. To tackle this problem, many Active
Learning (AL) methods have been developed. However,
these methods mainly define their sample selection criteria within a single image context, leading to suboptimal robustness and impractical solutions for large-scale object detection. In this paper, aiming to remedy the drawbacks of existing AL methods, we present a principled
Self-supervised Sample Mining (SSM) process accounting
for the real challenges in object detection. Specifically,
our SSM process focuses on automatically discovering
and pseudo-labeling reliable region proposals to enhance the object detector via the proposed cross-image validation, i.e., pasting these proposals into different labeled
images to comprehensively measure their values under varying image contexts. By resorting to the SSM process, we
propose a new AL framework for gradually incorporating
unlabeled or partially labeled data into the model learning
while minimizing the annotation effort of users. Extensive
experiments on two public benchmarks clearly demonstrate
that our framework achieves performance comparable to state-of-the-art methods with significantly
fewer annotations.
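The cross-image validation idea can be illustrated with a minimal sketch: a candidate region is pasted into several labeled images, a detector scores it in each context, and the region is accepted for pseudo-labeling only when its average confidence is high. All names here (`paste`, `toy_detector`, `cross_image_consistency`) are hypothetical simplifications, not the paper's actual implementation; a real detector and consistency criterion would replace the toy stand-ins.

```python
import numpy as np

def paste(image, patch, top=0, left=0):
    """Paste `patch` into a copy of `image` at (top, left)."""
    out = image.copy()
    h, w = patch.shape[:2]
    out[top:top + h, left:left + w] = patch
    return out

def toy_detector(image, region):
    """Stand-in for a trained detector: returns a confidence in [0, 1]
    for the pasted region (here, just the region's mean intensity)."""
    top, left, h, w = region
    return float(image[top:top + h, left:left + w].mean())

def cross_image_consistency(patch, labeled_images, detector, threshold=0.5):
    """Paste the candidate region into each labeled image, score it in
    every context, and accept it for pseudo-labeling only when the
    average confidence clears the threshold."""
    region = (0, 0, *patch.shape[:2])
    scores = [detector(paste(img, patch), region) for img in labeled_images]
    avg = float(np.mean(scores))
    return avg, avg >= threshold
```

A consistently high score across many pasted contexts suggests the proposal is reliable on its own merits rather than an artifact of its original surroundings, which is the intuition behind validating proposals outside a single image context.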