Abstract
Unseen Action Recognition (UAR) aims to recognise
novel action categories without training examples. While
previous methods focus on intra-dataset seen/unseen splits,
this paper proposes a pipeline using a large-scale training source to achieve a Universal Representation (UR) that
can generalise to a more realistic Cross-Dataset UAR (CD-UAR) scenario. We first address UAR as a Generalised
Multiple-Instance Learning (GMIL) problem and discover
‘building-blocks’ from the large-scale ActivityNet dataset
using distribution kernels. Essential visual and semantic
components are preserved in a shared space to achieve the
UR that can efficiently generalise to new datasets. Predicted UR exemplars can be improved by a simple semantic adaptation, after which an unseen action can be recognised directly using the UR at test time. Extensive experiments show that, without further training, the approach yields significant improvements on the UCF101 and HMDB51 benchmarks.