Abstract
Estimating dense visual correspondences between objects with intra-class variation, deformations and background clutter remains a challenging problem. Thanks to
the breakthrough of CNNs there are new powerful features
available. Despite their easy accessibility and great success, existing semantic flow methods could not significantly
benefit from these without extensive additional training.
We introduce a novel method for semantic matching with
pre-trained CNN features which is based on convolutional
feature pyramids and activation guided feature selection.
For the final matching we propose a sparse graph matching framework where each salient feature selects among a
small subset of nearest neighbors in the target image. To
improve our method in the unconstrained setting without
bounding box annotations we introduce novel object proposal based matching constraints. Furthermore, we show
that the sparse matching can be transformed into a dense
correspondence field. Extensive experimental evaluations
on benchmark datasets show that our method significantly
outperforms existing semantic matching methods.