Abstract
We consider the problem of estimating repetition in video,
such as performing push-ups, cutting a melon or playing
violin. Existing work shows good results under the assumption of static and stationary periodicity. As realistic
video is rarely perfectly static and stationary, the often preferred Fourier-based measurement is inapt. Instead, we
adopt the wavelet transform to better handle non-static and
non-stationary video dynamics. From the flow field and its
differentials, we derive three fundamental motion types and
three motion continuities of intrinsic periodicity in 3D. On
top of this, the 2D perception of 3D periodicity considers
two extreme viewpoints. What follows are 18 fundamental
cases of recurrent perception in 2D. In practice, to deal
with the variety of repetitive appearance, our theory implies
measuring time-varying flow F_t and its differentials ∇F_t,
∇ · F_t and ∇ × F_t over segmented foreground motion. For
experiments, we introduce the new QUVA Repetition dataset,
reflecting reality by including non-static and non-stationary
videos. On the task of counting repetitions in video, we obtain
favorable results compared to a deep learning alternative.
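
As an illustration of the flow differentials named above, the following is a minimal sketch, not the implementation used in this work. It assumes a dense 2D flow field stored as a NumPy array of shape (H, W, 2) and approximates the gradient, divergence and curl with finite differences. The function name flow_differentials, the helper make_grid and the synthetic test fields are hypothetical, and foreground segmentation and the wavelet analysis are left out.

```python
import numpy as np


def flow_differentials(flow):
    """Finite-difference gradient, divergence and curl of a 2D flow field.

    flow: array of shape (H, W, 2), where flow[..., 0] is the horizontal
    component u and flow[..., 1] is the vertical component v.
    (Layout is an assumption made for this sketch.)
    """
    u, v = flow[..., 0], flow[..., 1]
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    gradient = np.stack([du_dx, du_dy, dv_dx, dv_dy], axis=-1)
    divergence = du_dx + dv_dy   # expansion or contraction of the flow
    curl = dv_dx - du_dy         # in-plane rotation (scalar curl in 2D)
    return gradient, divergence, curl


def make_grid(size=9):
    """Centered pixel coordinates (x, y) for a size-by-size grid."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    center = (size - 1) / 2
    return xs - center, ys - center


x, y = make_grid()
expansion = np.stack([x, y], axis=-1)    # flow pointing away from the center
rotation = np.stack([-y, x], axis=-1)    # flow circling the center

_, div_exp, curl_exp = flow_differentials(expansion)
_, div_rot, curl_rot = flow_differentials(rotation)
```

On these synthetic fields the expanding flow has nonzero divergence and zero curl, while the rotating flow has zero divergence and nonzero curl. This suggests why the differentials can respond to repetitive motion that the flow alone represents poorly.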