Robust Facial Landmark Detection
via a Fully-Convolutional Local-Global Context Network
Abstract
While fully-convolutional neural networks are very strong at modeling local features, they fail to aggregate global context due to their constrained receptive field. Modern methods typically address the lack of global context by introducing cascades, pooling, or by fitting a statistical model. In this work, we propose a new approach that introduces global context into a fully-convolutional neural network directly. The key concept is an implicit kernel convolution within the network. The kernel convolution blurs the output of a local-context subnet, which is then refined by a global-context subnet using dilated convolutions. The kernel convolution is crucial for the convergence of the network because it smooths the gradients and reduces overfitting. In a postprocessing step, a simple PCA-based 2D shape model is fitted to the network output in order to filter outliers. Our experiments demonstrate the effectiveness of our approach, outperforming several state-of-the-art methods in facial landmark detection.
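The blur-then-refine idea from the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the kernel sizes, the Gaussian sigma, the dilation factor, and the helper names `gaussian_kernel` and `conv2d` are all illustrative assumptions. A sharp landmark heatmap (standing in for the local-context subnet's output) is first blurred by a kernel convolution, and a dilated convolution (standing in for the global-context subnet) then operates on it with a much wider effective receptive field.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    # normalized 2-D Gaussian kernel (illustrative parameters)
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def conv2d(img, kernel, dilation=1):
    # naive 'same'-padded 2-D convolution with optional dilation
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    ph, pw = eff_h // 2, eff_w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i * dilation:i * dilation + img.shape[0],
                                         j * dilation:j * dilation + img.shape[1]]
    return out

# local-context output: a sharp heatmap with one landmark peak
heatmap = np.zeros((32, 32))
heatmap[16, 16] = 1.0

# implicit kernel convolution: blur the heatmap before the global stage
blurred = conv2d(heatmap, gaussian_kernel())

# global-context stage: a dilated convolution covers a much wider area
refined = conv2d(blurred, gaussian_kernel(size=3, sigma=1.0), dilation=4)
```

The blur spreads the loss gradient over a neighborhood of the true landmark position, which is the intuition behind the abstract's claim that the kernel convolution smooths the gradients.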
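The PCA-based outlier filtering mentioned as postprocessing can likewise be sketched. The toy training shapes, the number of retained modes, and the helper `filter_outliers` are assumptions for illustration only: a detected shape is projected onto the span of the top principal components of the training shapes, which pulls a grossly misplaced landmark back toward statistically plausible positions.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy training set: 100 "face shapes" of 5 (x, y) landmarks, flattened to 10-D,
# generated from 2 latent modes of variation plus small noise (illustrative data)
n_landmarks, n_modes = 5, 2
basis_true = rng.normal(size=(n_modes, 2 * n_landmarks))
mean_true = rng.normal(size=2 * n_landmarks)
coeffs = rng.normal(size=(100, n_modes))
shapes = mean_true + coeffs @ basis_true \
    + 0.01 * rng.normal(size=(100, 2 * n_landmarks))

# build the PCA shape model from the training shapes
mean = shapes.mean(axis=0)
_, _, vt = np.linalg.svd(shapes - mean, full_matrices=False)
components = vt[:n_modes]            # top principal directions

def filter_outliers(shape):
    """Project a detected shape onto the PCA subspace, suppressing
    landmark positions inconsistent with the learned shape statistics."""
    b = components @ (shape - mean)  # shape coefficients
    return mean + components.T @ b   # reconstruction in the model subspace

# corrupt one landmark coordinate of a valid shape, then filter it
clean = mean_true + np.array([1.0, -0.5]) @ basis_true
noisy = clean.copy()
noisy[0] += 5.0                      # gross outlier in one coordinate
filtered = filter_outliers(noisy)
```

Because the outlier component lies mostly outside the low-dimensional shape subspace, the projection removes most of it while leaving the plausible part of the shape intact.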