Abstract
This paper concerns action recognition from unseenand unknown views. We propose unsupervised learningof a non-linear model that transfers knowledge from mul-tiple views to a canonical view. The proposed Non-linearKnowledge Transfer Model (NKTM) is a deep network, withweight decay and sparsity constraints, which finds a sharedhigh-level virtual path from videos captured from different unknown viewpoints to the same canonical view. The strength of our technique is that we learn a single NKTM for all actions and all camera viewing directions. Thus, NKTM does not require action labels during learning and knowledge of the camera viewpoints during training or testing. NKTM is learned once only from dense trajectories of syn-thetic points fitted to mocap data and then applied to real video data. Trajectories are coded with a general codebook learned from the same mocap data. NKTM is scalable to new action classes and training data as it does not require re-learning. Experiments on the IXMAS and N-UCLA datasets show that NKTM outperforms existing state-of-theart methods for cross-view action recognition.