
Buffy Stickmen V3: Human Pose Annotation Image Data

2019-11-16

[Image: annotated example frame]

Overview

High-quality ground-truth annotations for 2D human pose layout (e.g. HumanEva) are typically acquired in artificial laboratory settings, and the task is often simplified by static backgrounds, well-centered persons, and high-contrast clothing.

We release here a dataset of unconstrained images with associated ground-truth stickmen annotations. The data comes from the TV show Buffy the Vampire Slayer and is very challenging: persons appear at a variety of scales, against highly cluttered backgrounds, and wear all kinds of clothing.
For each imaged person, we provide line segments indicating the location, size, and orientation of six body parts (head, torso, upper/lower right/left arms). Exactly one person is annotated in each frame. The package includes a total of 748 annotated video frames over 5 episodes of the fifth season of BTVS. Our results on the test subset of this dataset (three episodes, 276 frames in total) are published in [1-5].
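Since each annotation is just a set of six line segments, a stickman can be handled in memory as a small mapping from part name to endpoint pairs. The sketch below illustrates this; the part names, ordering, and coordinates are assumptions made for illustration only, since the actual annotations are distributed as Matlab files with the package.

```python
import numpy as np

# Hypothetical names for the six annotated parts (illustrative only).
PARTS = ["torso", "left_upper_arm", "right_upper_arm",
         "left_lower_arm", "right_lower_arm", "head"]

def part_geometry(segment):
    """Return length (pixels) and orientation (radians) of one part,
    given its two endpoints as rows of a (2, 2) array."""
    p1, p2 = np.asarray(segment, dtype=float)
    d = p2 - p1
    return np.linalg.norm(d), np.arctan2(d[1], d[0])

# One stickman with made-up endpoint coordinates (x, y) in pixels.
stickman = {
    "torso":           [[100, 120], [100, 200]],
    "head":            [[100,  70], [100, 115]],
    "left_upper_arm":  [[ 95, 125], [ 60, 160]],
    "right_upper_arm": [[105, 125], [140, 160]],
    "left_lower_arm":  [[ 60, 160], [ 55, 200]],
    "right_lower_arm": [[140, 160], [150, 200]],
}

for name in PARTS:
    length, angle = part_geometry(stickman[name])
    print(f"{name:16s} length={length:5.1f}px  angle={np.degrees(angle):6.1f} deg")
```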

Below, the scatter plot inspired by [6] depicts pose variability over this dataset. Stickmen are centered on the neck and scale-normalized; hence the plot captures only pose variability, not scale or location variability. Note that the variability of poses in the test set is higher than that reported in fig. 2 of [6], which shows the scatter plots for episode 5 only.
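The centering and scale normalization used for this plot can be sketched as follows. The choice of the upper torso endpoint as the neck and the torso length as the scale factor is an assumption for illustration, reusing the stickman dictionary from the sketch above.

```python
import numpy as np

def normalize_stickman(stickman, neck, scale):
    """Translate all endpoints so the neck sits at the origin and divide
    by a scale factor, so only the pose itself remains."""
    neck = np.asarray(neck, dtype=float)
    return {name: (np.asarray(seg, dtype=float) - neck) / float(scale)
            for name, seg in stickman.items()}

# Illustrative choice: neck = upper torso endpoint, scale = torso length.
torso = np.asarray(stickman["torso"], dtype=float)
normalized = normalize_stickman(stickman, torso[0],
                                np.linalg.norm(torso[1] - torso[0]))
```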

[Scatter plot image]


In addition, the package includes official Matlab routines to evaluate the performance of your pose estimation system on this dataset. These routines implement the protocol of [4,5], and hence allow an exact comparison to our results.

Clarification of the PCP evaluation criterion

The Matlab code to evaluate PCP provided with this dataset represents the official evaluation protocol for the following datasets: Buffy Stickmen, ETHZ PASCAL Stickmen, and We Are Family Stickmen. In our PCP implementation, a body part produced by an algorithm is considered correctly localized if its endpoints are closer to their ground-truth locations than a threshold, on average over the two endpoints. Using it ensures results comparable to the vast majority of results previously reported on these datasets.
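In code, this averaged criterion compares the mean endpoint error of a part against a threshold expressed as a fraction of the ground-truth part length (0.5 is the commonly reported setting). The following is only an illustrative sketch of that logic, not the official Matlab routines shipped with the package; the function and argument names are my own.

```python
import numpy as np

def pcp_correct_loose(est, gt, thresh=0.5):
    """Averaged ("loose") PCP: the part is correct if the *mean* distance of
    the two estimated endpoints to their ground-truth endpoints is below
    thresh * (ground-truth part length).

    est, gt: (2, 2) arrays of endpoints, assumed to be in corresponding order.
    """
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    part_len = np.linalg.norm(gt[0] - gt[1])
    endpoint_errs = np.linalg.norm(est - gt, axis=1)
    return bool(endpoint_errs.mean() <= thresh * part_len)
```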

Recently, an alternative implementation of the PCP criterion, based on a stricter interpretation of its description in Ferrari et al. CVPR 2008, has been used in some works, including Johnson et al. BMVC 2010 and Pishchulin et al. CVPR 2012. In this implementation, a body part is considered correct only if both of its endpoints are closer to their ground-truth locations than the threshold. These two different PCP measures are the consequence of the ambiguous wording in the original verbal description of PCP in Ferrari et al. CVPR 2008 (which did not mention averaging over endpoints). Importantly, the stricter PCP version has essentially been used only on datasets other than the ones mentioned above, in particular on IIP (Iterative Image Parsing dataset, Ramanan NIPS 2006) and LSP (Leeds Sports Pose dataset, Johnson et al. BMVC 2010).
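For contrast, the stricter interpretation differs only in how the two endpoint errors are combined: each endpoint must individually fall within the threshold. Again, this is a hedged sketch of the idea, not the evaluation code of Johnson et al. or Pishchulin et al.

```python
import numpy as np

def pcp_correct_strict(est, gt, thresh=0.5):
    """Strict PCP: the part is correct only if *each* endpoint is within
    thresh * (ground-truth part length) of its ground-truth position."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    part_len = np.linalg.norm(gt[0] - gt[1])
    endpoint_errs = np.linalg.norm(est - gt, axis=1)
    return bool(np.all(endpoint_errs <= thresh * part_len))
```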

In order to keep a healthy research environment and guarantee the comparability of results across different research groups and different years, we recommend the following policy:

  • use our evaluation code, which computes the original, looser PCP measure, on Buffy Stickmen, ETHZ PASCAL Stickmen, We Are Family Stickmen, i.e. essentially on all datasets released by us

  • Some other datasets unfortunately have no official evaluation code released with them, and it is therefore harder to establish an exact and fully official protocol. Nonetheless, based on the protocols followed by most papers that have appeared so far, we recommend using the strict PCP measure on IIP and LSP. A precise definition of the strict PCP measure can be found in Pishchulin et al. CVPR 2012.

D. Ramanan, "Learning to Parse Images of Articulated Bodies", in NIPS, 2006.
S. Johnson and M. Everingham, "Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation", in BMVC, 2010.
L. Pishchulin, A. Jain, M. Andriluka, T. Thormaehlen and B. Schiele, "Articulated People Detection and Pose Estimation: Reshaping the Future", in CVPR, 2012.

New detection windows

In the current release we provide new detection windows obtained with the Calvin upper-body detector, which yield a 95% detection rate on the test set. This is 10% higher than the previously released detections, which were produced by the old VGG upper-body detector and used in [4]. These new detection windows have the important advantage of enabling evaluation of pose estimation performance over a greater coverage of the test set.

You can use these new detection windows as input to your own human pose estimator, to ensure an exact comparison to [5] in terms of pose estimation performance. Along with the new detection windows, we also provide our new pose estimates from [5].

For reference, the package still includes the old detections and pose estimates from [4]. However, these are now obsolete and should not be used.

