Abstract
Monocular 3D facial shape reconstruction from a single
2D facial image has been an active research area due to its
wide applications. Inspired by the success of deep neural
networks (DNN), we propose a DNN-based approach for
End-to-End 3D FAce Reconstruction (UH-E2FAR) from a
single 2D image. Unlike recent works that iteratively reconstruct
and refine the 3D face using both an RGB image and a rendering of
an initial 3D facial shape, our DNN model is end-to-end and thus
avoids the complicated 3D rendering process. Moreover, we
integrate two components into the DNN architecture, namely
a multi-task loss function and a fusion convolutional neural
network (fusion-CNN), to improve facial expression reconstruction.
With the multi-task loss function, 3D face reconstruction is
divided into neutral 3D facial shape reconstruction and
expressive 3D facial shape reconstruction. Because the neutral 3D
facial shape is class-specific, higher-layer features are useful
for it, whereas the expressive 3D facial shape favors lower- or
intermediate-layer features. With the fusion-CNN, features from
different intermediate layers are fused and transformed to predict
the expressive 3D facial shape. Through extensive experiments, we demonstrate
the superiority of our end-to-end framework in improving
the accuracy of 3D face reconstruction.
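
The two components named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, since the abstract does not specify these details: the squared-error terms, the loss weights `w_neutral` and `w_expr`, the concatenate-then-linear fusion, and all function names are hypothetical and are not taken from the UH-E2FAR model.

```python
# Illustrative sketch only. The loss form, the weights, and the fusion
# operator are assumptions; the abstract does not define them.

def squared_error(pred, target):
    """Sum of squared differences between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def multi_task_loss(neutral_pred, neutral_gt, expr_pred, expr_gt,
                    w_neutral=1.0, w_expr=1.0):
    """Weighted sum of a neutral-shape term and an expressive-shape term.

    The weights are hypothetical hyperparameters.
    """
    return (w_neutral * squared_error(neutral_pred, neutral_gt)
            + w_expr * squared_error(expr_pred, expr_gt))


def fuse_features(feature_vectors, weights, bias):
    """Concatenate feature vectors from several layers, then apply one
    linear transform (one row of `weights` per output value)."""
    fused = [x for vec in feature_vectors for x in vec]
    return [sum(w * x for w, x in zip(row, fused)) + b
            for row, b in zip(weights, bias)]


# Toy usage: two "intermediate layers" feed the expression branch.
expr_pred = fuse_features([[1.0, 2.0], [3.0]],
                          weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
                          bias=[0.5, -1.0])
loss = multi_task_loss([1.0, 2.0], [1.0, 1.0], [0.0], [2.0],
                       w_neutral=1.0, w_expr=0.5)
```

In this sketch the neutral-shape prediction would come from the last layer of the network, while the expressive-shape prediction comes from the fused intermediate features, mirroring the split described above.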