
INRIA Person Dataset (Original Data)

2019-09-18

This dataset was collected as part of research work on detection of upright people in images and video. The research is described in detail in the CVPR 2005 paper "Histograms of Oriented Gradients for Human Detection" and my PhD thesis. The dataset is divided into two formats: (a) original images with corresponding annotation files, and (b) positive images in a normalized 64x128 pixel format (as used in the CVPR paper) with the original negative images.

Contributions

The data set contains images from several different sources:

  • Images from the GRAZ 01 dataset, though the annotation files are completely new.

  • Images from personal digital image collections taken over a long time period. Usually the original positive images were of very high resolution (approx. 2592x1944 pixels), so we have cropped these images to highlight persons. Many people are bystanders taken from the backgrounds of these input photos, so ideally there is no particular bias in their pose.

  • A few images were taken from the web using Google image search.

Note

  • Only upright persons (with a person height greater than 100 pixels) are marked in each image.

  • Annotations may not be exact; in particular, portions of an annotated bounding box may at times fall outside or inside the object.

Original Images

Folders 'Train' and 'Test' correspond, respectively, to original training and test images. Both folders have three sub folders: (a) 'pos' (positive training or test images), (b) 'neg' (negative training or test images), and (c) 'annotations' (annotation files for positive images in Pascal Challenge format).

Normalized Images

Folders 'train_64x128_H96' and 'test_64x128_H96' correspond to the normalized dataset as used in the paper referenced above. Both folders have two sub folders: (a) 'pos' (normalized positive training or test images centered on the person, together with their left-right reflections), and (b) 'neg' (the original negative training or test images). Note that images in folder 'train/pos' are 96x160 pixels (a margin of 16 pixels on each side), and images in folder 'test/pos' are 70x134 pixels (a margin of 3 pixels on each side). This has been done to avoid boundary conditions (and thus to avoid any particular bias in the classifier). In both folders, use the centered 64x128-pixel window for the detection task.
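As a minimal sketch of extracting that centered window (using Pillow; the function name and file path are hypothetical, not part of the dataset tools):

```python
from PIL import Image

def center_crop_64x128(path):
    """Crop the centered 64x128 detection window from a normalized
    positive image (96x160 for train, 70x134 for test)."""
    img = Image.open(path)               # hypothetical path to a 'pos' image
    w, h = img.size
    left, top = (w - 64) // 2, (h - 128) // 2
    return img.crop((left, top, left + 64, top + 128))

# A 96x160 training positive is cropped at offset (16, 16),
# a 70x134 test positive at offset (3, 3).
```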

Negative windows

To generate negative training windows from the normalized images, a fixed set of 12180 windows (10 windows per negative image) is sampled randomly from the 1218 negative training photos, providing the initial negative training set. For each detector and parameter combination, a preliminary detector is trained and all negative training images are searched exhaustively (over a scale-space pyramid) for false positives ('hard examples'). All examples with a score greater than zero are considered hard examples. The detector is then re-trained using this augmented set (the initial 12180 windows plus the hard examples) to produce the final detector. The set of hard examples is subsampled if necessary, so that the descriptors of the final training set fit into 1.7 GB of RAM for SVM training.
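A rough illustration of this sampling step (not the original implementation; the counts and window size follow the description above, and the helper names in the comments are hypothetical stand-ins for the detector pipeline):

```python
import random

def sample_negative_windows(img_w, img_h, n=10, win_w=64, win_h=128, seed=None):
    """Sample n random 64x128 window origins from one negative image."""
    rng = random.Random(seed)
    return [(rng.randint(0, img_w - win_w), rng.randint(0, img_h - win_h))
            for _ in range(n)]

# Hard-example mining, in outline (train_svm / score / all_pyramid_windows
# are hypothetical helpers):
#   1. initial  = 10 windows per image over 1218 negatives  -> 12180 windows
#   2. detector = train_svm(positives, initial)
#   3. hard     = [w for w in all_pyramid_windows(negatives) if score(detector, w) > 0]
#   4. detector = train_svm(positives, initial + subsample(hard))   # final detector
```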

The starting scale in the scale-space pyramid above is 1, and we keep adding levels to the pyramid as long as floor(ImageWidth/Scale) > 64 and floor(ImageHeight/Scale) > 128. The scale ratio between two consecutive levels in the pyramid is 1.2. The window stride (sampling distance between two consecutive windows) at any scale is 8 pixels. If, after fitting all windows at a scale level, some margin remains at the borders, we shift the whole window grid by half of that margin: for example, if the image size at the current level is (75, 130), the margin left over (with a stride of 8 and a window size of 64x128) is (3, 2), and we shift all windows by (floor(MarginX/2), floor(MarginY/2)) = (1, 1). The new image width and height at each level are NewWidth = floor(OrigWidth/Scale) and NewHeight = floor(OrigHeight/Scale), where Scale = 1 corresponds to the original image size.
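A minimal sketch of this pyramid and window-grid layout, assuming only the parameters stated above (scale ratio 1.2, stride 8, 64x128 window); the function names are illustrative, not from the original code:

```python
import math

def pyramid_scales(img_w, img_h, ratio=1.2, win_w=64, win_h=128):
    """Scale levels: start at 1 and multiply by 1.2 while the rescaled image
    is still strictly larger than the 64x128 window."""
    scales, s = [], 1.0
    while math.floor(img_w / s) > win_w and math.floor(img_h / s) > win_h:
        scales.append(s)
        s *= ratio
    return scales

def window_grid(level_w, level_h, stride=8, win_w=64, win_h=128):
    """64x128 window origins at one pyramid level; the whole grid is shifted
    by floor(margin/2) so the leftover border is split evenly."""
    xs = list(range(0, level_w - win_w + 1, stride))
    ys = list(range(0, level_h - win_h + 1, stride))
    shift_x = (level_w - (xs[-1] + win_w)) // 2
    shift_y = (level_h - (ys[-1] + win_h)) // 2
    return [(x + shift_x, y + shift_y) for y in ys for x in xs]

# The 75x130 example from the text leaves a (3, 2) margin, so the grid of
# window origins is shifted by (1, 1).
```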

The same sampling structure is used to create negative windows when testing on negative images.

