Abstract
Visual object tracking has been a fundamental topic in
recent years and many deep learning based trackers have
achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers struggle to reach top
performance at real-time speed. In this paper, we propose the Siamese region proposal network (Siamese-RPN),
which is end-to-end trained off-line with large-scale image
pairs. Specifically, it consists of a Siamese subnetwork for
feature extraction and a region proposal subnetwork comprising a classification branch and a regression branch. In the
inference phase, the proposed framework is formulated as a
local one-shot detection task. We can pre-compute the template branch of the Siamese subnetwork and formulate the
correlation layers as trivial convolution layers to perform
online tracking. Benefiting from the proposal refinement, the traditional multi-scale test and online fine-tuning can be discarded. The Siamese-RPN runs at 160 FPS while achieving
leading performance in the VOT2015, VOT2016 and VOT2017
real-time challenges.
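
To illustrate how correlation layers can be treated as ordinary convolutions once the template branch is pre-computed, the following is a minimal NumPy sketch, not the authors' implementation. Everything in it is assumed for illustration: the helper names `cross_correlate` and `make_kernels`, the toy tensor sizes, random arrays standing in for the Siamese subnetwork's features, and the split into 2 score channels and 4 box-offset channels per anchor.

```python
import numpy as np


def cross_correlate(search, kernels):
    """Valid cross-correlation of search (C, H, W) with kernels (K, C, h, w)."""
    h, w = kernels.shape[2:]
    # windows has shape (C, H - h + 1, W - w + 1, h, w)
    windows = np.lib.stride_tricks.sliding_window_view(search, (h, w), axis=(1, 2))
    # Sum over channels and the window extent for each of the K kernels.
    return np.einsum("cijhw,kchw->kij", windows, kernels)


def make_kernels(template_feat, num_anchors, channels):
    """Split template features (6 * num_anchors * channels, h, w) into kernels.

    Assumed layout: the first 2 * num_anchors * channels maps feed the
    classification branch, the remaining 4 * num_anchors * channels maps
    feed the regression branch.
    """
    h, w = template_feat.shape[1:]
    n_cls = 2 * num_anchors * channels
    cls = template_feat[:n_cls].reshape(2 * num_anchors, channels, h, w)
    reg = template_feat[n_cls:].reshape(4 * num_anchors, channels, h, w)
    return cls, reg


rng = np.random.default_rng(0)
K_ANCHORS, C = 3, 8  # toy sizes, chosen only to keep the example small

# First frame: compute the template branch once and keep it as fixed kernels.
template_feat = rng.standard_normal((6 * K_ANCHORS * C, 4, 4))
cls_kernels, reg_kernels = make_kernels(template_feat, K_ANCHORS, C)

# Later frames: only the search branch and two plain correlations are needed.
for _ in range(2):
    search_feat = rng.standard_normal((C, 12, 12))  # stand-in for backbone output
    scores = cross_correlate(search_feat, cls_kernels)
    offsets = cross_correlate(search_feat, reg_kernels)
    print(scores.shape, offsets.shape)
```

With these toy sizes each frame yields a (6, 9, 9) score map and a (12, 9, 9) offset map. The point of the sketch is that the kernels are built once from the template and reused unchanged, so the per-frame cost is a single pass over the search region.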