LabelMe 12-50k Dataset: Raw Data

2019-09-18

Description: 

The LabelMe-12-50k dataset consists of 50,000 JPEG images (40,000 for training and 10,000 for testing) extracted from LabelMe [1]. Each image is 256x256 pixels. In both the training and testing sets, 50% of the images show a centered object belonging to one of the 12 object classes listed in Table 1. The remaining 50% show a randomly selected region of a randomly selected image ("clutter").

The dataset is a difficult challenge for object recognition systems because the instances of each object class vary greatly in appearance, lighting conditions, and viewing angle. Furthermore, centered objects may be partly occluded, or other objects (or parts of them) may be present in the image. See [1] for a more detailed description of the dataset.

Table 1: Object classes and number of instances in the LabelMe-12-50k dataset

 #   Object class              Instances in    Instances in
                               training set    testing set
 1   person                           4,885           1,180
 2   car                              3,829             974
 3   building                         2,085             531
 4   window                           4,097           1,028
 5   tree                             1,846             494
 6   sign                               954             249
 7   door                               830             178
 8   bookshelf                          391             100
 9   chair                              385              88
10   table                              192              54
11   keyboard                           324              75
12   head                               212              49

     clutter                         20,000           5,000

     total number of images          40,000          10,000


Annotation format: 

The dataset archive contains annotation files in two formats:

  • Human-readable text files (annotation-train.txt and annotation-test.txt), which contain in each line an image file name (without the .jpg extension) and 12 class labels corresponding to the 12 object classes.

  • Binary files (annotation-train.bin and annotation-test.bin), which contain 12 successive 32-bit float values for each image, each value representing the class label of the corresponding class. The file does not contain any meta information (e.g., there is no header).
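Both formats described above can be read with a few lines of Python. The following is a minimal sketch, not a reference loader: it assumes the binary values are little-endian IEEE 754 floats (the description does not state the byte order) and that the fields in the text files are separated by whitespace. The function names `parse_binary_annotations` and `parse_text_line` are hypothetical. Since each image occupies 12 x 4 = 48 bytes, the training file should be 40,000 x 48 = 1,920,000 bytes long if the description holds.

```python
import struct

NUM_CLASSES = 12  # label values per image, per the dataset description

# Assumption: little-endian 32-bit floats; the byte order is not documented.
RECORD = struct.Struct("<12f")


def parse_binary_annotations(data):
    """Split raw bytes into one 12-tuple of floats per image (no header)."""
    if len(data) % RECORD.size != 0:
        raise ValueError("length is not a multiple of 48 bytes")
    return list(RECORD.iter_unpack(data))


def parse_text_line(line):
    """Parse one text-annotation line into (image_name, labels).

    Assumption: the file name and the 12 labels are whitespace-separated.
    """
    fields = line.split()
    if len(fields) != NUM_CLASSES + 1:
        raise ValueError(f"expected {NUM_CLASSES + 1} fields, got {len(fields)}")
    return fields[0], tuple(float(field) for field in fields[1:])


# Hypothetical usage, assuming the archive was extracted to the working directory:
# with open("annotation-train.bin", "rb") as handle:
#     train_labels = parse_binary_annotations(handle.read())
```

If the unpacked values fall outside the range -1.0 to 1.0, the byte-order assumption is probably wrong and `"<12f"` should be changed to `">12f"`.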


The annotation label values of the two file formats differ slightly because the values in the text files are rounded to the second decimal place. If you want to report recognition rates, you should use the binary annotation files for training and testing because of the more precise label values.

All label values are between -1.0 and 1.0. For the 50% of non-clutter images, the label of the depicted object is set to 1.0. Because instances of other object classes may also be present in the image (in object images as well as in clutter images), each of the other labels has either a value of -1.0 or a value between 0.0 and 1.0. A value of -1.0 is set either if no instance of the object class is present in the image or if the degree of overlap (calculated from the size and position of the object's bounding box) is below a certain threshold. Values above 0.0 are assigned if this threshold is exceeded. A value of 1.0 means that the corresponding object is exactly centered in the image and 160 pixels in size (in its larger dimension), just like the extracted objects.
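The label semantics above can be turned into a small helper that separates the depicted object from partially present objects. This is only a sketch under two assumptions that the description does not confirm: the 12 labels are stored in the order of Table 1, and an image without any label equal to 1.0 is treated as clutter. The names `CLASS_NAMES` and `describe_labels` are hypothetical.

```python
# Assumption: label order follows the class numbering in Table 1.
CLASS_NAMES = (
    "person", "car", "building", "window", "tree", "sign",
    "door", "bookshelf", "chair", "table", "keyboard", "head",
)


def describe_labels(labels):
    """Return (main_class, partial) for one 12-value label vector.

    main_class is the first class labeled exactly 1.0, or "clutter" if
    there is none. partial maps every other class with a label above 0.0
    to its value; labels of -1.0 (absent or below threshold) are skipped.
    """
    if len(labels) != len(CLASS_NAMES):
        raise ValueError("expected 12 label values")
    main_class = None
    partial = {}
    for name, value in zip(CLASS_NAMES, labels):
        if value == 1.0 and main_class is None:
            main_class = name
        elif value > 0.0:
            partial[name] = value
    return (main_class or "clutter"), partial
```

The exact comparison with 1.0 is intended for the binary annotation files. In the text files, values are rounded to two decimal places, so a label slightly below 1.0 could appear as 1.00 there.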


Recognition rates: 

Currently, the only results shown in Table 2 are from our paper [1]. If you would like to report recognition rates, please send them to uetz _at_ ais.uni-bonn.de, including a link to your publication or a description of the method you used.

Table 2: Training and testing error rates on the LabelMe-12-50k dataset

Method used                         Training error rate   Testing error rate   Reported by
Locally-connected Neural Pyramid    3.77%                 16.27%               Uetz and Behnke 2009 [1]
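For comparison with Table 2, an error rate can be computed as the fraction of misclassified images. The sketch below assumes exactly that definition, with one predicted class and one target class per image; the evaluation protocol in [1] may differ, so check the paper before reporting numbers. The function name `error_rate` is hypothetical.

```python
def error_rate(predicted, actual):
    """Fraction of images whose predicted class differs from the target class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one image is required")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# Example: one of four images is misclassified.
rate = error_rate(
    ["car", "clutter", "tree", "door"],
    ["car", "clutter", "sign", "door"],
)
print(f"{rate:.2%}")
```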

