Description
The dataset contains images of people collected from the web by typing common given names into Google Image Search. The coordinates of the eyes, the nose and the center of the mouth for each frontal face are provided in a ground-truth file. This information can be used to align and crop the human faces or as ground truth for a face detection algorithm. The dataset has 10,524 human faces of various resolutions and in different settings, e.g. portrait images, groups of people, etc. Profile faces and very low-resolution faces are not labeled.
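As an illustration of how the eye coordinates could drive alignment and cropping, here is a minimal Python sketch. It is not one of the scripts distributed with the dataset. The function names, the `target_dist` and `margin` parameters, and the assumption that each eye is available as an (x, y) pixel pair are all hypothetical, and the format of the ground-truth file is not assumed here.

```python
import math

def eye_alignment(left_eye, right_eye, target_dist=60.0):
    """Return (angle_deg, scale, center) describing the eye line.

    left_eye / right_eye are (x, y) pixel coordinates (hypothetical input
    format). angle_deg is the tilt of the line joining the eyes, scale is
    the factor that would bring the eye distance to target_dist pixels,
    and center is the midpoint between the eyes.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    dx, dy = rx - lx, ry - ly
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("eye coordinates coincide")
    angle_deg = math.degrees(math.atan2(dy, dx))
    scale = target_dist / dist
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    return angle_deg, scale, center

def crop_box(left_eye, right_eye, margin=1.5):
    """Square crop (x0, y0, x1, y1) centred between the eyes.

    The half-width is margin times the inter-eye distance; margin is an
    illustrative choice, not a value taken from the dataset.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    half = margin * math.hypot(rx - lx, ry - ly)
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0
    return (cx - half, cy - half, cx + half, cy + half)
```

The returned angle, scale and center are the parameters of a similarity transform; an image library of your choice can then apply the rotation and resize before cropping.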
Resources
Here are a number of useful MATLAB scripts for this data:
A MATLAB script that displays the images is here.
The data contains a total of 10,524 faces in 7,092 images. The average image resolution is 304x312 pixels across the data. Here is a script which displays image resolution statistics.
Statistics on the resolution of the faces in the dataset are presented below. The MATLAB script used to generate them is here.
The data has a number of duplicate images. These duplicates were distributed among different people to provide ground truth, so they can be used to evaluate the reliability and precision of the manually generated ground truth. Here is a list of the images which we believe to be duplicates and here is a script which identifies them.
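To make the duplicate-based reliability check concrete, here is a minimal Python sketch under stated assumptions. It is not the script linked above, whose method is not described on this page. The sketch only finds byte-identical files by hashing, so it would miss resized or re-encoded copies. The function names and input formats (a mapping from file name to raw bytes, and lists of (x, y) landmarks) are hypothetical.

```python
import hashlib
import math
from collections import defaultdict

def group_exact_duplicates(images):
    """Group file names whose contents are byte-identical.

    images: dict mapping a file name to its raw bytes (hypothetical
    input format). Returns a sorted list of sorted name groups, keeping
    only groups with more than one member.
    """
    by_digest = defaultdict(list)
    for name, data in images.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return sorted(sorted(names) for names in by_digest.values()
                  if len(names) > 1)

def mean_landmark_distance(a, b):
    """Mean Euclidean distance, in pixels, between two annotations.

    a and b are equal-length lists of (x, y) landmarks for the same
    face, e.g. eyes, nose and mouth center marked by two annotators.
    """
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and equal length")
    total = sum(math.hypot(px - qx, py - qy)
                for (px, py), (qx, qy) in zip(a, b))
    return total / len(a)
```

For each pair of duplicate images, a small mean distance between the two sets of landmarks suggests the annotators agree closely; large values flag faces where the manual ground truth is less reliable.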