ALOV++ Object Tracking Video Data

2019-11-15

The Amsterdam Library of Ordinary Videos for tracking, ALOV++, aims to cover as diverse a set of circumstances as possible: illumination, transparency, specularity, confusion with similar objects, clutter, occlusion, zoom, severe shape changes, different motion patterns, low contrast, and so on. In composing the ALOV++ dataset, preference was given to many assorted short videos over a few longer ones. For each of these aspects, we collect video sequences ranging from easy to difficult, with the emphasis on difficult videos. ALOV++ is also composed to be upward compatible with other tracking benchmarks by including 11 standard tracking video sequences from existing datasets for the aspects which cover smoothness and occlusion. Additionally, we have selected 11 standard video sequences frequently used in recent tracking papers, on the aspects of light, albedo, transparency, motion smoothness, confusion, occlusion and shaking camera. Sixty-five sequences have been reported earlier in the PETS workshop, and 250 are new, for a total of 315 video sequences.

The main source of the data is real-life videos from YouTube with 64 different types of targets, ranging from a human face, a person, a ball, an octopus and microscopic cells to a plastic bag or a can. The collection is categorized by thirteen aspects of difficulty and contains many hard to very hard videos, such as a dancer, a rock singer in a concert, completely transparent glass, an octopus, a flock of birds, a soldier in camouflage, a completely occluded object, and videos with extreme zooming that introduces abrupt motion of the targets.

To maximize diversity, most of the sequences are short. The average length of the short videos is 9.2 seconds, with a maximum of 35 seconds. One additional category contains ten long videos with a duration between one and two minutes. The total number of frames in ALOV++ is 89364. The data in ALOV++ are annotated every fifth frame with an axis-aligned rectangular bounding box of flexible size. In rare cases, when motion is rapid, the annotation is more frequent. The ground truth for the intermediate frames has been obtained by linear interpolation. The ground truth bounding box in the first frame is given to the trackers.
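
The linear interpolation of intermediate-frame ground truth described above can be sketched roughly as follows. This is only a minimal illustration, not the dataset's own tooling: the box layout `(x_min, y_min, x_max, y_max)`, the function name `interpolate_boxes`, and the sample coordinates are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, ...]


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Fill in boxes between annotated frames by linear interpolation.

    `keyframes` maps an annotated frame index to its box. Every frame
    between two consecutive annotated frames gets a box whose coordinates
    lie on the straight line between the two annotated boxes.
    """
    frames = sorted(keyframes)
    dense: Dict[int, Box] = {}
    for start, end in zip(frames, frames[1:]):
        box_a, box_b = keyframes[start], keyframes[end]
        span = end - start
        for frame in range(start, end):
            t = (frame - start) / span  # 0.0 at `start`, approaching 1.0 at `end`
            dense[frame] = tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
    if frames:
        # The last annotated frame is copied through unchanged.
        dense[frames[-1]] = keyframes[frames[-1]]
    return dense


# Hypothetical annotations five frames apart (made-up coordinates).
annotated = {1: (10.0, 20.0, 50.0, 80.0), 6: (20.0, 30.0, 60.0, 90.0)}
dense = interpolate_boxes(annotated)
```

With annotations spaced five frames apart, each intermediate frame moves every coordinate one fifth of the way from the earlier box to the later one. When annotations are denser, as in the rapid-motion cases, the same function applies unchanged because the step size is derived from the gap between neighbouring annotated frames.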
