Face Alignment by MobileNetv2. Note that MTCNN is used to provide the input bounding box. You need to modify the image paths in order to run the demo.
Network Structure
The most important part of the mobilenet-v2 network is the design of the bottleneck. In our experiments, we crop the face image by the bounding box and resize it to 64 * 64, which is the input size of the network. Based on this, we can design the structure of our customized mobilenet-v2 for facial landmark localization. Note that the receptive field is a key factor in the design of the network.
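The crop-and-resize step also changes the coordinate frame of the landmarks. Below is a minimal sketch of that mapping, assuming the 64 * 64 input size listed in the performance table and a box given as (x1, y1, x2, y2). The function names are hypothetical and are not taken from this repository.

```python
def bbox_to_crop_coords(landmarks, bbox, input_size=64):
    """Map landmarks from full-image pixels into the resized face crop.

    landmarks: list of (x, y) in full-image pixels.
    bbox: (x1, y1, x2, y2) face box, e.g. from a detector such as MTCNN.
    input_size: side length of the square network input (assumed to be 64).
    """
    x1, y1, x2, y2 = bbox
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError("bounding box must have positive width and height")
    sx = input_size / width    # horizontal resize factor
    sy = input_size / height   # vertical resize factor
    return [((x - x1) * sx, (y - y1) * sy) for x, y in landmarks]


def crop_to_image_coords(landmarks, bbox, input_size=64):
    """Inverse mapping: predicted crop coordinates back to the full image."""
    x1, y1, x2, y2 = bbox
    sx = (x2 - x1) / input_size
    sy = (y2 - y1) / input_size
    return [(x * sx + x1, y * sy + y1) for x, y in landmarks]
```

The inverse function is what a demo would need in order to draw the predicted points on the original image.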
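| Input | Operator | t | channels | n | stride |
|---|---|---|---|---|---|
| | conv2d | - | 16 | 1 | 2 |
| | bottleneck | 6 | 24 | 1 | 2 |
| | conv2d | 6 | 24 | 1 | 1 |
| | conv2d | 6 | 32 | 1 | 2 |
| | conv2d | 6 | 32 | 1 | 1 |
| | conv2d | 6 | 64 | 1 | 2 |
| | conv2d | 6 | 64 | 1 | 1 |
| | inner product | - | 200 | 1 | - |
| 200 | inner product | - | 200 | 1 | - |
| 200 | inner product | - | 50 | 1 | - |
| 50 | inner product | - | 136 | 1 | - |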
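To make the receptive-field remark concrete, here is a rough sketch that walks the stride column of the table and tracks the feature-map size and receptive field after each spatial layer. It assumes one 3x3 convolution per row (the depthwise convolution in a MobileNetV2 bottleneck is 3x3; the kernel of the first conv2d is a guess), "same" padding, and a 64 * 64 input. The names `LAYERS` and `trace` are hypothetical.

```python
import math

# (kernel, stride) for each spatial row of the table, in order.
# Strides come from the table; the 3x3 kernel size is an assumption.
LAYERS = [(3, 2), (3, 2), (3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]


def trace(input_size, layers):
    """Return (feature_map_size, receptive_field) after each layer."""
    size, rf, jump = input_size, 1, 1
    rows = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump        # growth scales with accumulated stride
        jump *= stride                   # distance between adjacent outputs, in input pixels
        size = math.ceil(size / stride)  # "same" padding assumed
        rows.append((size, rf))
    return rows


if __name__ == "__main__":
    for size, rf in trace(64, LAYERS):
        print(f"feature map {size}x{size}, receptive field {rf}")
```

Under these assumptions the last spatial layer produces a 4x4 map with a receptive field of 87 pixels, so each unit can see the whole 64-pixel crop before the inner product layers.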
Note that this structure mainly has two features:
Use LeakyReLU rather than ReLU.
Use a bottleneck embedding layer, whose dimension is 50 in our experiments.
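The two features can be illustrated with a small sketch: a scalar LeakyReLU and a parameter count for the inner product head with its 50-dimensional embedding. The negative slope of 0.01 and the flattened input size of 1024 (4 * 4 * 64) are assumptions, not values from this repository; the widths 200, 200, 50 and 136 come from the table, and 136 corresponds to 68 landmarks with two coordinates each.

```python
def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for positive inputs, a small slope otherwise."""
    return x if x > 0 else negative_slope * x


def dense_params(in_features, out_features):
    """Weights plus biases of one inner product (fully connected) layer."""
    return in_features * out_features + out_features


# Output widths of the inner product layers, as listed in the table.
HEAD = [200, 200, 50, 136]


def head_params(in_features, widths=HEAD):
    """Total parameter count of the stacked inner product layers."""
    total = 0
    for out_features in widths:
        total += dense_params(in_features, out_features)
        in_features = out_features
    return total


if __name__ == "__main__":
    print(leaky_relu(-2.0))   # negative inputs are scaled, not zeroed
    print(head_params(1024))  # 1024 is an assumed flattened feature size
```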
Training
The training data includes:
Training data of 300W dataset
Training data of Menpo dataset
Data Augmentation
Data augmentation is important to the performance of face alignment. I have tried several data augmentation methods, including:
Random Flip.
Random Shift.
Random Scale.
Random Rotation. The image is rotated by an angle sampled from -30 to 30 degrees.
Random Noise. Gaussian noise is added to the input images.
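The augmentations listed above can be sketched on the landmark side as follows. Only the rotation range of -30 to 30 degrees comes from this README; the shift range, scale range, flip probability and noise sigma are hypothetical placeholders. The image itself would need to be warped with the same transform (for example with an affine warp), which is not shown here.

```python
import math
import random


def random_affine_params(rng, max_angle=30.0, max_shift=0.1, scale_range=(0.9, 1.1)):
    """Sample one set of augmentation parameters (ranges other than the angle are assumed)."""
    return {
        "angle": rng.uniform(-max_angle, max_angle),  # degrees
        "scale": rng.uniform(*scale_range),
        "shift": (rng.uniform(-max_shift, max_shift),
                  rng.uniform(-max_shift, max_shift)),  # fraction of image size
        "flip": rng.random() < 0.5,
    }


def transform_landmarks(landmarks, size, angle=0.0, scale=1.0, shift=(0.0, 0.0), flip=False):
    """Apply flip, scale, rotation and shift to (x, y) landmarks of a size x size crop.

    Note: after a horizontal flip the left/right landmark indices must also be
    swapped according to the annotation scheme; that remapping is omitted here.
    """
    c = (size - 1) / 2.0  # rotate and scale about the crop center
    theta = math.radians(angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for x, y in landmarks:
        if flip:
            x = (size - 1) - x  # mirror the x coordinate
        dx, dy = (x - c) * scale, (y - c) * scale
        rx = cos_t * dx - sin_t * dy
        ry = sin_t * dx + cos_t * dy
        out.append((rx + c + shift[0] * size, ry + c + shift[1] * size))
    return out


def add_gaussian_noise(pixels, rng, sigma=5.0):
    """Add Gaussian noise to a flat list of pixel values and clamp to [0, 255]."""
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


if __name__ == "__main__":
    rng = random.Random(0)
    params = random_affine_params(rng)
    print(transform_landmarks([(10.0, 20.0)], 64, **params))
```

Sampling the parameters once and applying them to both the image and the landmarks keeps the two consistent.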
Performance
The performance on 300W is not good enough yet; more experiments may be needed. If you have any ideas, please contact me or open an issue.
| Method | Input Size | Common | Challenge | Full set | Training Data |
|---|---|---|---|---|---|
| VGG-Shadow (With Dropout) | 70 * 60 | 5.66 | 10.82 | 6.67 | 300W |
| Mobilenet-v2-stage1 | 64 * 64 | 6.07 | 10.60 | 6.96 | 300W and Menpo |
| Mobilenet-v2-stage2 | 64 * 64 | 5.76 | 8.93 | 6.39 | 300W and Menpo |
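The README does not state which error metric the table reports. Results on the 300W common, challenging and full sets are usually given as a normalized mean error (NME) in percent, so the sketch below assumes that metric with inter-ocular normalization (outer eye corners, indices 36 and 45 in the 0-based 68-point markup). Both the metric and the normalization are assumptions, and `nme_percent` is a hypothetical helper.

```python
import math


def nme_percent(pred, gt, left_eye_idx=36, right_eye_idx=45):
    """Mean point-to-point error divided by the inter-ocular distance, in percent.

    pred, gt: lists of (x, y) landmarks in the same order.
    The eye-corner indices assume the 0-based 68-point markup.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of landmarks")
    norm = math.dist(gt[left_eye_idx], gt[right_eye_idx])
    if norm == 0:
        raise ValueError("inter-ocular distance is zero")
    mean_err = sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)
    return 100.0 * mean_err / norm
```

A dataset-level score under this assumption would be the average of this value over all test images.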
Dataset
| Dataset | Number of images for training |
|---|---|
| 300-W | 3148 |
| Menpo | 12006 |
Result on 300W
The ground-truth landmarks are shown in white, while the predicted ones are shown in blue.
I wrote a demo to view the alignment results. In addition, the yaw, roll and pitch angles are estimated from the predicted landmarks. To run the demo, please do: