Street Scenes
The StreetScenes Challenge Framework is a collection of images, annotations, software and performance measures for object detection. Each image was taken with a DSC-F717 camera in and around Boston, MA. Each image was then labeled by hand with polygons surrounding each example of nine object categories: cars, pedestrians, bicycles, buildings, trees, skies, roads, sidewalks, and stores. The labeling was closely checked to ensure that objects were always labeled in the same way with respect to occlusions and other common image transformations. StreetScenes labels are also compatible with LabelMe annotations; a one-to-one conversion tool is provided here. For more information on the collection of the data, see Stanley Bileschi’s thesis.
These examples show some sample images from the database at reduced resolution. The label images show the types of labels in the database. In these illustrations the polygons are drawn opaque and colored, but in the label files each polygon is stored as a simple list of corners. To keep the illustrations simple, pedestrians and cars are shown with bounding boxes, but the database includes actual polygons around these objects as well.
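As a small illustration of the relationship between the polygon labels and the bounding boxes drawn in the sample images, the following Python sketch derives an axis-aligned box from a list of corners. The function name and the `(x, y)` tuple representation are assumptions made for this example; the actual label file layout is not described here.

```python
def polygon_bounding_box(corners):
    """Return (x_min, y_min, x_max, y_max) for a polygon.

    `corners` is assumed to be a list of (x, y) tuples; this is a
    hypothetical in-memory form, not the StreetScenes file format.
    """
    if not corners:
        raise ValueError("polygon has no corners")
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

Calling `polygon_bounding_box` on the corners of a labeled car would give the kind of box shown in the illustrations, while the polygon itself remains available for finer-grained use.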
The purpose of this framework is to provide modular access to a complete detection framework so that development may proceed on any part independently. A researcher may build a better learning architecture, or a better feature, without having to engineer the rest of the system.
Three separate measures were developed to evaluate performance in the StreetScenes Challenge Framework: the crop-wise detection measure, the point-wise detection measure, and the bounding box-wise detection measure. These measures are complementary: each applies in situations where the others do not make sense.
Object Detection Models
Crop-Wise Object Detection
Crop-Wise object detection is a simple and common way of measuring the power of an object detection system. In this method, small crops of positive and negative examples of the target object category are first extracted from the larger images. For instance, positive car images would contain tightly cropped images of cars, while negative car images would contain anything but cars. These crops are then represented as feature vectors, e.g. with wavelets or histograms of gradients, and a statistical learning machine is trained to classify the two sets. To measure the efficacy of the learned detector, part of the labeled set is held out for testing (I prefer to use about one third). Repeating this training/testing split several times gives a more statistically reliable measure of crop-wise object detection.
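The repeated training/testing procedure described above can be sketched in Python as follows. This is a minimal illustration, not the framework's own Matlab code: the class-mean-difference scorer is a stand-in for whatever learning machine is actually used, the stratified split is a choice made here, and all function names are hypothetical. Features are assumed to be already extracted as lists of numbers, with labels 1 (positive crop) and 0 (negative crop).

```python
import random


def stratified_split(labels, test_fraction, rng):
    """Hold out `test_fraction` of each class; return (train, test) indices."""
    train, test = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_scorer(features, labels):
    """Stand-in learner: project onto (positive mean - negative mean)."""
    pos = mean_vector([f for f, y in zip(features, labels) if y == 1])
    neg = mean_vector([f for f, y in zip(features, labels) if y == 0])
    w = [p - q for p, q in zip(pos, neg)]
    return lambda f: sum(wi * fi for wi, fi in zip(w, f))


def roc_auc(scores, labels):
    """Area under the ROC curve: chance that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout_auc(features, labels, repeats=10, test_fraction=1 / 3, seed=0):
    """Train on about two thirds, test on the rest, repeated several times.

    Assumes each class has at least two examples.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        train, test = stratified_split(labels, test_fraction, rng)
        scorer = train_scorer([features[i] for i in train],
                              [labels[i] for i in train])
        scores = [scorer(features[i]) for i in test]
        aucs.append(roc_auc(scores, [labels[i] for i in test]))
    return aucs
```

The list of per-split AUC values returned by `repeated_holdout_auc` can be averaged, and its spread gives a rough idea of how much the result depends on the particular split.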
Images, Annotations | Extraction | Extracted | Feature | Pos/Neg | Learning Validation Code | ROC Curves
General Matlab Support Code
Point-Wise Object Detection
Point-Wise object detection is similar to crop-wise object detection, except that rather than classifying boxes which fit around the object of interest, we classify points (and their neighborhoods) inside the object. In this method, a positive point set and a negative point set are selected (i.e. points inside and outside of the object). At each of these points a feature vector is extracted, which in general depends on patterns of brightness and color in the neighborhood of the point. Once these features have been extracted, learning and testing occur as in crop-wise object detection.
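The point selection step can be illustrated with a short Python sketch. It assumes a single labeled polygon given as a list of `(x, y)` corners and draws points uniformly over the image, sorting them into inside (positive) and outside (negative) sets with a standard ray-casting test. The sampling scheme and all names are assumptions made for this example; the framework's own selection strategy is not specified here.

```python
import random


def point_in_polygon(x, y, corners):
    """Ray casting: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_points(corners, width, height, n_per_class, seed=0, max_tries=100000):
    """Draw up to `n_per_class` points inside and outside the polygon.

    May return fewer points if `max_tries` random draws are not enough.
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    for _ in range(max_tries):
        if len(positives) >= n_per_class and len(negatives) >= n_per_class:
            break
        x, y = rng.uniform(0, width), rng.uniform(0, height)
        target = positives if point_in_polygon(x, y, corners) else negatives
        if len(target) < n_per_class:
            target.append((x, y))
    return positives, negatives
```

A feature would then be computed from the neighborhood of each returned point, after which training and testing proceed as for crops.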
Individual Downloads not available yet.
Bounding Box-Wise Object Detection
Bounding Box-Wise object detection is the measure closest to actually running a useful object detection system on these types of scenes. In this method, an object detector is trained, as in crop-wise detection, but then applied to a reserved set of test images at multiple positions and scales. The response of the detector is fed to a local-neighborhood suppression algorithm, which outputs a set of positions and confidences within the test set for possible object existence. This set is then compared to the human benchmark positions, and detections which are close enough in position and scale are called true detections. Using this data, a precision-recall curve is drawn to measure the total system performance.
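The suppression, matching and precision-recall steps above can be sketched in Python. Several choices here are assumptions rather than the framework's definitions: greedy overlap-based suppression stands in for the local-neighborhood suppression algorithm, "close enough in position and scale" is approximated by an intersection-over-union threshold of 0.5, and the example handles the boxes of a single image rather than a pooled test set. Boxes are `(x_min, y_min, x_max, y_max)` tuples and detections are `(confidence, box)` pairs.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def suppress(detections, overlap=0.5):
    """Greedy suppression: keep the strongest box, drop neighbors overlapping it."""
    kept = []
    for score, box in sorted(detections, reverse=True):
        if all(iou(box, kept_box) < overlap for _, kept_box in kept):
            kept.append((score, box))
    return kept


def precision_recall(detections, truths, match_iou=0.5):
    """Return (precision, recall) pairs as the confidence threshold is lowered.

    Each ground-truth box can be matched by at most one detection.
    """
    if not truths:
        raise ValueError("need at least one ground-truth box")
    matched = set()
    true_pos = false_pos = 0
    curve = []
    for score, box in sorted(detections, reverse=True):
        best, best_iou = None, match_iou
        for j, truth in enumerate(truths):
            overlap = iou(box, truth)
            if j not in matched and overlap >= best_iou:
                best, best_iou = j, overlap
        if best is None:
            false_pos += 1
        else:
            matched.add(best)
            true_pos += 1
        curve.append((true_pos / (true_pos + false_pos), true_pos / len(truths)))
    return curve
```

In use, the raw multi-scale detector responses would be passed through `suppress`, and the surviving detections through `precision_recall` together with the hand-labeled boxes.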
Individual Downloads not available yet.