Dataset Resource: Street Scenes

Street Scenes

2020-02-06

The StreetScenes Challenge Framework is a collection of images, annotations, software, and performance measures for object detection.  Each image was taken with a DSC-F717 camera in and around Boston, MA.  Each image was then labeled by hand with polygons surrounding every example of nine object categories: cars, pedestrians, bicycles, buildings, trees, skies, roads, sidewalks, and stores.  The labeling was done under scrutiny to ensure that objects were always labeled in the same way with regard to occlusions and other common image transforms.  StreetScenes labels are also compatible with LabelMe annotations; a one-to-one conversion tool is provided here.  For more information on the collection of the data, see Stanley Bileschi's thesis.

 

These examples show some sample images from the database at reduced resolution.  The label images show the types of labels in the database.  In these illustrations the polygons are opaque and colored, but in the label files each polygon is stored as a simple list of corners.  The illustrations also show bounding boxes around pedestrians and cars for simplicity, but the database includes the actual polygons around these objects as well.
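As a concrete illustration of how a corner-list polygon relates to the bounding boxes drawn in the illustrations, the short Python sketch below reduces a polygon to its enclosing axis-aligned box.  The hard-coded corner list is purely illustrative; the actual StreetScenes label-file syntax is not reproduced here.

def polygon_to_bbox(corners):
    # corners: list of (x, y) tuples describing one labeled polygon.
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)   # (xmin, ymin, xmax, ymax)

car_polygon = [(120.0, 310.0), (260.0, 305.0), (265.0, 380.0), (118.0, 385.0)]
print(polygon_to_bbox(car_polygon))             # (118.0, 305.0, 265.0, 385.0)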

 

The purpose of this framework is to provide modular access to a complete detection framework so that development may proceed on any part independently.  A researcher may build a better learning architecture, or a better feature, without having to engineer the rest of the system.

 

Three separate object detection measures were developed to measure performance in the StreetScenes Challenge Framework: the crop-wise detection measure, the point-wise detection measure, and the bounding-box-wise detection measure.  These measures are complementary, each applying where the others do not make sense.

 

Object Detection Models

 

Crop-Wise Object Detection

Crop-wise object detection is a simple and common way of measuring the power of an object detection system.  In this method, small crops of positive and negative examples of the target object category are first extracted from the larger images.  For instance, positive car images would contain nicely cropped images of cars, while negative car images would contain anything but cars.  These crops are then represented mathematically, e.g. with wavelets, histograms of gradients, or any other feature, and a statistical learning machine is employed to learn a classifier between the two sets.  In order to measure the efficacy of the learned detector, part of the training set is reserved for testing (I prefer to use about one third).  Repeating this training/testing split several times gives a statistically meaningful measure of crop-wise object detection performance.

                         

 

[Crop-wise pipeline diagram: Images and Annotations → Extraction Code → Extracted Image Crops → Feature Generation Code → Pos/Neg Feature Matrices → Learning & Cross-Validation Code → ROC Curves, all built on the General Matlab Support Code.]
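A minimal Python sketch of this crop-wise protocol is given below, assuming the positive and negative crops have already been extracted as grayscale arrays.  The intensity-histogram feature and the linear SVM are stand-ins for whatever feature and learning machine a researcher plugs in; they are not part of the released framework.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_auc_score

def histogram_feature(crop, bins=32):
    # Toy feature: a normalized intensity histogram of the grayscale crop.
    hist, _ = np.histogram(crop, bins=bins, range=(0, 255), density=True)
    return hist

def crop_wise_evaluation(pos_crops, neg_crops, n_splits=5, test_size=1/3):
    # Repeat random train/test splits (one third held out) and report ROC AUC.
    X = np.array([histogram_feature(c) for c in list(pos_crops) + list(neg_crops)])
    y = np.array([1] * len(pos_crops) + [0] * len(neg_crops))
    aucs = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed)
        clf = LinearSVC().fit(X_tr, y_tr)          # stand-in learning machine
        aucs.append(roc_auc_score(y_te, clf.decision_function(X_te)))
    return float(np.mean(aucs)), float(np.std(aucs))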

 

Point-Wise Object Detection

Point-wise object detection is similar to crop-wise object detection, except that rather than classifying boxes which fit around the object of interest, we classify points (and their neighborhoods) inside the object.  In this method, a positive point set and a negative point set are selected (i.e. points inside and outside of the object).  At each of these points a mathematical feature is extracted, which in general depends on the patterns of brightness and color in the neighborhood of the point.  Once these features have been extracted, learning and testing proceed as in crop-wise object detection.
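The sketch below illustrates one way the point sampling step could look, assuming a grayscale image and a boolean object mask are already in hand; the neighborhood radius, sample counts, and the raw-patch feature are illustrative choices, not the framework's own.

import numpy as np

def sample_point_features(image, mask, n_per_class=100, radius=8, seed=0):
    # image: 2-D grayscale array; mask: boolean array, True inside the object.
    # Returns (features, labels) for points sampled inside/outside the mask.
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for label, region in ((1, mask), (0, ~mask)):
        ys, xs = np.nonzero(region)
        # Keep only points whose neighborhood lies fully inside the image.
        ok = ((ys >= radius) & (ys < image.shape[0] - radius) &
              (xs >= radius) & (xs < image.shape[1] - radius))
        ys, xs = ys[ok], xs[ok]
        picks = rng.choice(len(ys), size=min(n_per_class, len(ys)), replace=False)
        for y, x in zip(ys[picks], xs[picks]):
            patch = image[y - radius:y + radius + 1, x - radius:x + radius + 1]
            feats.append(patch.ravel().astype(np.float32))  # raw neighborhood as feature
            labels.append(label)
    return np.array(feats), np.array(labels)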

 

Individual downloads are not available yet.

 

Bounding Box-Wise Object Detection

Bounding-box-wise object detection is the measure closest to actually running a useful object detection system on these kinds of scenes.  In this method, an object detector is trained as in crop-wise detection, but is then applied to a reserved set of test images at multiple positions and scales.  The response of the detector is fed to a local-neighborhood suppression algorithm, which outputs a set of positions and confidences within the test set for possible object locations.  This set is then compared to the human-benchmark positions, and detections which are close enough in position and scale are counted as true detections.  From this data a precision-recall curve is drawn to measure the total system performance.
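The Python sketch below walks through the suppression and matching steps just described.  The distance and scale tolerances, and the greedy suppression rule, are illustrative assumptions and not the exact criteria used by the StreetScenes scoring code.

import numpy as np

def suppress_neighbors(dets, min_dist=20.0):
    # Greedy local-neighborhood suppression: keep the highest-scoring detection
    # and drop any lower-scoring one within min_dist pixels of a kept one.
    dets = sorted(dets, key=lambda d: d[3], reverse=True)   # (x, y, scale, score)
    kept = []
    for x, y, s, score in dets:
        if all(np.hypot(x - kx, y - ky) >= min_dist for kx, ky, _, _ in kept):
            kept.append((x, y, s, score))
    return kept

def precision_recall(dets, gts, max_dist=25.0, max_scale_ratio=1.5):
    # Match detections (highest confidence first) to ground-truth (x, y, scale)
    # entries; a detection counts as true if close enough in position and scale.
    dets = sorted(dets, key=lambda d: d[3], reverse=True)
    matched = [False] * len(gts)
    tp = fp = 0
    curve = []
    for x, y, s, _ in dets:
        hit = False
        for i, (gx, gy, gs) in enumerate(gts):
            if not matched[i] and np.hypot(x - gx, y - gy) <= max_dist \
                    and max(s, gs) / min(s, gs) <= max_scale_ratio:
                matched[i] = hit = True
                break
        tp, fp = (tp + 1, fp) if hit else (tp, fp + 1)
        curve.append((tp / (tp + fp), tp / len(gts)))        # (precision, recall) point
    return curve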

 

Individual downloads are not available yet.

