Abstract
Appearance features have been widely used in video
anomaly detection even though they contain complex entangled factors. We propose a new method to model the normal patterns of human movements in surveillance video for
anomaly detection using dynamic skeleton features. We decompose the skeletal movements into two sub-components:
global body movement and local body posture. We model
the dynamics and interaction of the coupled features in our
novel Message-Passing Encoder-Decoder Recurrent Network. We observe that the decoupled features collaboratively interact in our spatio-temporal model to accurately
identify human-related irregular events from surveillance
video sequences. Compared to traditional appearance-based models, our method achieves superior outlier detection performance. Our model also offers “open-box” examination and decision explanation made possible by the
semantically understandable features and a network architecture supporting interpretability.
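
As an illustration of the decomposition into global body movement and local body posture, the following is a minimal sketch and not necessarily the exact formulation used in this work. It assumes that the global component can be summarized by the skeleton's bounding box (center and size) and the local component by the joint coordinates normalized relative to that box. The names `decompose_skeleton` and `recompose_skeleton` are hypothetical and introduced only for this example.

```python
from typing import List, Tuple

Joint = Tuple[float, float]                      # (x, y) image coordinates of one joint
GlobalPart = Tuple[float, float, float, float]   # (center_x, center_y, width, height)


def _safe(size: float) -> float:
    """Avoid division by zero for degenerate (zero-size) bounding boxes."""
    return size if size > 0 else 1.0


def decompose_skeleton(joints: List[Joint]) -> Tuple[GlobalPart, List[Joint]]:
    """Split one skeleton into a global part and a local part.

    Assumption: global = bounding box of the joints,
                local  = joints expressed relative to that box.
    """
    xs = [x for x, _ in joints]
    ys = [y for _, y in joints]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    cx = (max(xs) + min(xs)) / 2
    cy = (max(ys) + min(ys)) / 2
    local = [((x - cx) / _safe(width), (y - cy) / _safe(height)) for x, y in joints]
    return (cx, cy, width, height), local


def recompose_skeleton(global_part: GlobalPart, local: List[Joint]) -> List[Joint]:
    """Invert decompose_skeleton: map local posture back to image coordinates."""
    cx, cy, width, height = global_part
    return [(cx + lx * _safe(width), cy + ly * _safe(height)) for lx, ly in local]


if __name__ == "__main__":
    skeleton = [(10.0, 20.0), (30.0, 60.0), (20.0, 40.0)]
    global_part, local_part = decompose_skeleton(skeleton)
    print(global_part)
    print(local_part)
```

Under this assumption, the global part tracks where the person is in the frame and how large they appear, while the local part describes posture independently of position and scale, so each can be modeled by its own sequence branch.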