Abstract
A video dataset that is designed to study fine-grained categorisation of pedestrians is introduced. Pedestrians were recorded “in-the-wild” from a moving vehicle. Annotations include bounding boxes, tracks, 14 keypoints with occlusion information and the fine-grained categories of age (5 classes), sex (2 classes), weight (3 classes) and clothing style (4 classes). There are a total of 27,454 bounding box and pose labels across 4222 tracks. This dataset is designed to train and test algorithms for fine-grained categorisation of people; it is also useful for benchmarking tracking, detection and pose estimation of pedestrians. State-of-the-art algorithms for fine-grained classification and pose estimation were tested using the dataset and the results are reported as a useful performance baseline.