Abstract
Object reconstruction from a single image – in the wild – is a problem where we can make progress and get meaningful results today. This is the main message of this paper, which introduces an automated pipeline with pixels as inputs and 3D surfaces of various rigid categories as outputs in images of realistic scenes. At the core of our approach are deformable 3D models that can be learned from 2D annotations available in existing object detection datasets, that can be driven by noisy automatic object segmentations, and which we complement with a bottom-up module for recovering high-frequency shape details. We perform a comprehensive quantitative analysis and ablation study of our approach using the recently introduced PASCAL 3D+ dataset and show very encouraging automatic reconstructions on PASCAL VOC.