Abstract
For modeling the 3D world behind 2D images, which
3D representation is most appropriate? A polygon mesh
is a promising candidate because of its compactness and geometric
properties. However, it is not straightforward to model a
polygon mesh from 2D images using neural networks because the conversion from a mesh to an image, or rendering, involves a discrete operation called rasterization,
which prevents back-propagation. Therefore, in this work,
we propose an approximate gradient for rasterization that
enables the integration of rendering into neural networks.
Using this renderer, we perform single-image 3D mesh reconstruction with silhouette image supervision, and our system outperforms the existing voxel-based approach. Additionally, we perform gradient-based 3D mesh editing operations, such as 2D-to-3D style transfer and 3D DeepDream,
with 2D supervision for the first time. These applications
demonstrate the potential of integrating a mesh renderer into neural networks and the effectiveness of our proposed renderer.
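
As a rough illustration of the general idea only (not the gradient definition proposed in this paper), the sketch below shows how a discrete, non-differentiable forward operation can be given a hand-designed surrogate gradient so that it can be placed inside a neural network and trained end-to-end. The hard-thresholding stand-in for rasterization and the sigmoid-shaped backward pass are illustrative assumptions; the paper's renderer instead defines approximate pixel-value gradients with respect to mesh vertex positions.

```python
import torch


class ApproxDiscreteOp(torch.autograd.Function):
    """Discrete forward pass with a hand-designed surrogate backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Discrete operation: hard thresholding (a toy stand-in for rasterization).
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Surrogate gradient: pretend the forward pass was a smooth sigmoid,
        # so gradients can still flow back to the input.
        s = torch.sigmoid(x)
        return grad_output * s * (1 - s)


# Usage: the discrete op now sits inside a differentiable pipeline.
x = torch.randn(4, requires_grad=True)
y = ApproxDiscreteOp.apply(x)
y.sum().backward()
print(x.grad)  # non-zero gradients despite the non-differentiable forward pass
```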