Abstract
In this paper, we present a new perspective on
image-based shape generation. Most existing deep-learning-based
shape reconstruction methods employ a deterministic single-view
model, but a single view is sometimes insufficient to determine a unique ground-truth shape because the back of the object
is occluded. In this work, we first introduce a conditional
generative network to model the uncertainty for single-view
reconstruction. Then, we formulate the task of multi-view
reconstruction as taking the intersection of the shape spaces
predicted from each single image. We design new differentiable guidance, including a front constraint, a diversity constraint, and a consistency loss, to enable effective
single-view conditional generation and multi-view synthesis. Experimental results and ablation studies show that
our proposed approach outperforms state-of-the-art methods in 3D reconstruction test error and demonstrates
generalization ability on real-world data.