Abstract
Person re-identification benefits greatly from deep neural
networks (DNN) to learn accurate similarity metrics and
robust feature embeddings. However, most of the current
methods impose only local constraints for similarity learning. In this paper, we incorporate constraints on large image groups by combining a conditional random field (CRF) with deep neural networks. The proposed method aims to learn “local similarity” metrics for image pairs while taking into account
the dependencies from all the images in a group, forming
“group similarities”. During training, our method uses multiple images
to model the relationships between the local and global similarities in a unified CRF; at test time, it combines
multi-scale local similarities into the predicted similarity.
testing. We adopt an approximate inference scheme for estimating the group similarity, enabling end-to-end training.
Extensive experiments demonstrate the effectiveness of our
model that combines DNN and CRF for learning robust
multi-scale local similarities. Overall, our results outperform those of state-of-the-art methods by considerable margins
on three widely used benchmarks.
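
The idea of refining a local similarity using the other images in a group, and of combining multi-scale similarities at test time, can be illustrated with a minimal sketch. This is not the paper's formulation: the update rule, the function names `refine_group_similarity` and `combine_multiscale`, the mixing weight `alpha`, the iteration count, and the use of a plain average across scales are all assumptions chosen for illustration.

```python
def refine_group_similarity(probe_sims, gallery_sims, alpha=0.5, num_iters=3):
    """Refine probe-to-gallery similarities using gallery-to-gallery similarities.

    Hypothetical update (an assumption, not the paper's equations): each
    refined score mixes the original local similarity with a weighted average
    of the refined scores of the other gallery images.

    probe_sims:   list of N local similarities (probe vs. each gallery image)
    gallery_sims: N x N list of non-negative gallery-to-gallery similarities
    """
    n = len(probe_sims)
    # Row-normalize the gallery weights, ignoring self-similarity.
    weights = []
    for i in range(n):
        row = [gallery_sims[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        weights.append([w / total if total > 0 else 0.0 for w in row])

    refined = list(probe_sims)
    for _ in range(num_iters):
        # A fixed number of update steps stands in for approximate inference.
        refined = [
            alpha * probe_sims[i]
            + (1 - alpha) * sum(weights[i][j] * refined[j] for j in range(n))
            for i in range(n)
        ]
    return refined


def combine_multiscale(local_sims_per_scale):
    """Average per-scale local similarities into one test-time score (assumed rule)."""
    return sum(local_sims_per_scale) / len(local_sims_per_scale)
```

In this sketch, a gallery image whose local similarity to the probe is low but which closely resembles high-scoring gallery images has its score pulled upward, which is the intuition behind a group-level constraint. Because the update consists only of sums and products, it remains differentiable when unrolled for a fixed number of steps.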