Abstract
This paper presents a method for single-target tracking of arbitrary objects in challenging video sequences. Targets are modeled at three different levels of granularity (pixel level, parts-based level and bounding box level), which are cross-constrained to enable robust model relearning. The main contribution is an adaptive clustered decision tree method which dynamically selects the minimum combination of features necessary to sufficiently represent each target part at each frame, thereby providing robustness with computational efficiency. The adaptive clustered decision tree is implemented in two separate parts of the tracking algorithm: firstly to enable robust matching at the parts-based level between successive frames; and secondly to select the best superpixels for learning new parts of the target. We have tested the tracker using two different tracking benchmarks (VOT2013-2014 and CVPR2013 tracking challenges), based on two different test methodologies, and show it to be significantly more robust than the best state-of-the-art methods from both of those tracking challenges, while also offering competitive tracking precision.