Abstract
Recently, researchers have made great progress in building
category-specific 3D shape models from 2D images with
manual annotations consisting of class labels, keypoints,
and ground truth figure-ground segmentations. However,
the annotation of figure-ground segmentations is still labor-intensive and time-consuming. To further alleviate the burden of providing such manual annotations, we make the first effort to learn category-specific 3D shape models by
only using weakly labeled 2D images. By revealing the underlying relationship between the tasks of common object
segmentation and category-specific 3D shape reconstruction, we propose a novel framework to jointly solve these
two problems along a cluster-level learning curriculum.
Comprehensive experiments on the challenging PASCAL
VOC benchmark demonstrate that the category-specific 3D
shape models trained using our weakly supervised learning
framework can, to some extent, approach the performance
of state-of-the-art methods that rely on expensive manual segmentation annotations. In addition, the experiments
demonstrate the effectiveness of using 3D shape models for
helping common object segmentation.