Abstract. Reconstructing 3D facial geometry from a single RGB image has recently attracted wide research interest. However, it remains an ill-posed problem,
and most methods rely on prior models, which undermines the accuracy of the
recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPIs)
obtained from light field cameras and learn CNN models that recover horizontal
and vertical 3D facial curves from the respective horizontal and vertical EPIs.
Our 3D face reconstruction network (FaceLFnet) comprises a densely connected
architecture to learn accurate 3D facial curves from low-resolution EPIs. To train
the proposed FaceLFnets from scratch, we synthesize photo-realistic light field
images from 3D facial scans. The curve-by-curve 3D face estimation approach
allows the networks to learn from only 14K images of 80 identities, which still
comprise over 11 million EPIs/curves. The estimated facial curves are merged
into a single point cloud, to which a surface is fitted to obtain the final 3D face. Our
method is model-free, requires only a few training samples to learn FaceLFnet
and can reconstruct 3D faces with high accuracy from single light field images
under varying poses, expressions and lighting conditions. Comparisons on the
BU-3DFE and BU-4DFE datasets show that our method reduces reconstruction errors
by over 20% compared to the recent state of the art.
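
To make the notion of horizontal and vertical EPIs concrete, the following is a minimal sketch of how such slices could be taken from a 4D light field. It is an illustration under stated assumptions, not the paper's implementation: the array layout `(U, V, H, W, C)` (vertical view index, horizontal view index, image row, image column, color channel), the choice of the central view row/column, and the function names `horizontal_epi` and `vertical_epi` are all hypothetical.

```python
import numpy as np


def horizontal_epi(lf: np.ndarray, row: int) -> np.ndarray:
    """Slice a horizontal EPI for one image row.

    Assumes lf has layout (U, V, H, W, C). The vertical view index is
    fixed to the central one, so the result has shape (V, W, C).
    """
    u_center = lf.shape[0] // 2
    return lf[u_center, :, row, :]


def vertical_epi(lf: np.ndarray, col: int) -> np.ndarray:
    """Slice a vertical EPI for one image column.

    The horizontal view index is fixed to the central one, so the
    result has shape (U, H, C).
    """
    v_center = lf.shape[1] // 2
    return lf[:, v_center, :, col]


# Toy light field: 5x5 views of an 8x10 RGB image (random values).
lf = np.random.default_rng(0).random((5, 5, 8, 10, 3))
h_epi = horizontal_epi(lf, row=3)  # one EPI per image row
v_epi = vertical_epi(lf, col=4)    # one EPI per image column
```

Under this layout, each image row yields one horizontal EPI and each image column one vertical EPI, which is consistent with a single light field image contributing many EPI/curve training pairs.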