Abstract
This paper presents a novel pixelwise representation for visual tracking that models both the spatial structure and dynamics of a target in a unified fashion. The representation is derived from spatiotemporal energy measurements that capture underlying local spacetime orientation structure at multiple scales. For interframe motion estimation, the feature representation is instantiated within a pixelwise template warping framework; thus, the spatial arrangement of the pixelwise energy measurements remains intact. The proposed target representation is extremely rich, including appearance and motion information as well as information about how these descriptors are spatially arranged. Qualitative and quantitative empirical evaluation on challenging sequences demonstrates that the resulting tracker outperforms several alternative state-of-the-art systems.