Joint Representation and Truncated Inference
Learning for Correlation Filter based Tracking
Abstract. Correlation filter (CF) based trackers generally include two
modules, i.e., feature representation and on-line model adaptation. In
existing off-line deep learning models for CF trackers, model adaptation is usually either abandoned or restricted to a closed-form solution, which makes it
feasible to learn the deep representation in an end-to-end manner. However,
such solutions fail to exploit the advances in CF models, and cannot
achieve competitive accuracy in comparison with the state-of-the-art CF
trackers. In this paper, we investigate the joint learning of deep representation and model adaptation, where an updater network is introduced
for better tracking on the future frame by taking the current frame representation, the tracking result, and the last CF tracker as input. By modeling the
representor as a convolutional neural network (CNN), we truncate the alternating direction method of multipliers (ADMM) and interpret it as
a deep network serving as the updater, resulting in our model for learning representation and truncated inference (RTINet). Experiments demonstrate
that our RTINet tracker achieves favorable tracking accuracy against the
state-of-the-art trackers and its rapid version can run at a real-time speed
of 24 fps. The code and pre-trained models will be publicly available at
https://github.com/tourmaline612/RTINet
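
For readers unfamiliar with the closed-form model adaptation mentioned above, the following is a minimal sketch of a standard single-channel ridge-regression correlation filter solved element-wise in the Fourier domain. It is not the RTINet updater and is not taken from the released code; the function names `train_cf` and `respond`, the regularization weight `lam`, and the Gaussian label are illustrative assumptions.

```python
import numpy as np

def train_cf(x, y, lam=1e-2):
    """Closed-form CF (hypothetical helper, not from RTINet).

    Solves min_w ||x * w - y||^2 + lam * ||w||^2, where * is circular
    convolution, element-wise in the Fourier domain.
    """
    x_hat = np.fft.fft2(x)
    y_hat = np.fft.fft2(y)
    return np.conj(x_hat) * y_hat / (np.conj(x_hat) * x_hat + lam)

def respond(w_hat, z):
    """Response map of the learned filter on a search patch z."""
    return np.real(np.fft.ifft2(np.fft.fft2(z) * w_hat))

def gaussian_label(shape, center, sigma=2.0):
    """Desired response: a Gaussian peak at `center` (assumed label shape)."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Toy example: a random single-channel "feature map" stands in for the
# deep representation; a circular shift stands in for target motion.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
y = gaussian_label(x.shape, (16, 16))
w_hat = train_cf(x, y)

z = np.roll(x, shift=(3, -2), axis=(0, 1))
response = respond(w_hat, z)
peak = np.unravel_index(np.argmax(response), response.shape)
```

Because the solution is a single element-wise division, it is differentiable with respect to the features, which is what makes end-to-end representation learning feasible in this setting. An iterative solver such as ADMM has no such one-step form, which is the motivation for truncating it to a fixed number of stages.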