Abstract
We present a novel approach for automatically learning models of temporal trajectories extracted from video data. Instead of representing trajectories as fixed-length, linearly time-normalised vectors, our approach uses the Dynamic Time Warp distance as a similarity measure, capturing the underlying ordered structure of variable-length temporal data while removing the non-linear warping of the time scale. We reformulate the structure learning problem as an optimal graph partitioning of the dataset that exploits only Dynamic Time Warp similarity weights, without the need for intermediate cluster centroid representations. We extend the graph partitioning method, in particular the Normalised Cut model originally introduced for static image segmentation, to unsupervised clustering of temporal trajectories with fully automated model order selection. By computing a hierarchical average Dynamic Time Warp for each cluster, we learn warp-free trajectory models and recover the time warp profiles and structural variance in the data. We demonstrate the approach by modelling trajectories of continuous hand gestures and of moving objects in an indoor environment.
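As a rough illustration of the similarity measure referred to above, the following minimal sketch computes a textbook Dynamic Time Warp distance between two variable-length trajectories. The Euclidean local cost, the unconstrained step pattern, and the function name `dtw_distance` are assumptions made for illustration only; they are not taken from the paper, which may use a different formulation.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two point sequences of any lengths.

    Assumes a Euclidean local cost and the standard three-way step pattern
    (illustrative choices, not necessarily those used in the paper).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two points being matched.
            d = math.sqrt(sum((p - q) ** 2 for p, q in zip(a[i - 1], b[j - 1])))
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A trajectory and a time-warped copy of it (some points held for extra frames).
path = [(0, 0), (1, 1), (2, 2)]
warped = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]
print(dtw_distance(path, warped))
```

Because the warped copy visits the same points in the same order, the alignment absorbs the repeated frames and the distance is zero, whereas a linear resampling to a fixed length would generally leave a residual difference.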