Abstract
Due to the limited number of training samples, fine-tuning pre-trained deep models online is prone to over-fitting. In this paper, we propose a sequential training method for convolutional neural networks (CNNs) to effectively transfer pre-trained deep features for online applications. We regard a CNN as an ensemble, with each channel of the output feature map acting as an individual base learner. Each base learner is trained using a different loss criterion to reduce correlation and avoid over-training. To achieve the best ensemble online, all the base learners are sequentially sampled into the ensemble via importance sampling. To further improve the robustness of each base learner, we propose to train the convolutional layers with random binary masks, which serves as a regularizer that forces each base learner to focus on different input features. The proposed online training method is applied to the visual tracking problem by transferring deep features trained on massive annotated visual data, and is shown to significantly improve tracking performance. Extensive experiments on two challenging benchmark data sets demonstrate that our tracking algorithm outperforms state-of-the-art methods by a considerable margin.
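To make the random-binary-mask idea concrete, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes that each output channel (base learner) receives its own fixed binary mask over its kernel weights, so that each channel only sees a subset of the input features. The function names `make_channel_masks` and `masked_conv2d`, the `keep_prob` parameter, and the choice to mask kernel weights rather than input activations are all illustrative assumptions.

```python
import numpy as np


def make_channel_masks(num_out, num_in, ksize, keep_prob=0.5, seed=0):
    """Draw one fixed random binary mask per output channel (hypothetical helper).

    Returns an array of shape (num_out, num_in, ksize, ksize) with entries in {0, 1}.
    """
    rng = np.random.default_rng(seed)
    return (rng.random((num_out, num_in, ksize, ksize)) < keep_prob).astype(np.float32)


def masked_conv2d(x, weights, masks):
    """Valid cross-correlation of x (C_in, H, W) with masked kernels (C_out, C_in, k, k).

    Weights at masked-out positions are zeroed, so they never contribute to the
    output of their channel (assumed masking scheme).
    """
    w = weights * masks
    c_out, _, k, _ = w.shape
    h_out = x.shape[1] - k + 1
    w_out = x.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out), dtype=np.float32)
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]
            # Contract over (C_in, k, k) to get one response per output channel.
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out
```

Because the masks are drawn once and then held fixed, each output channel is tied to a different subset of input features throughout training, which is one plausible reading of the regularization effect described above.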