
Parsing videos of actions with segmental grammars

2019-12-16

Abstract

Real-world videos of human activities exhibit temporal structure at various scales; long videos are typically composed out of multiple action instances, where each instance is itself composed of sub-actions with variable durations and orderings. Temporal grammars can presumably model such hierarchical structure, but are computationally difficult to apply for long video streams. We describe simple grammars that capture hierarchical temporal structure while admitting inference with a finite-state-machine. This makes parsing linear time, constant storage, and naturally online. We train grammar parameters using a latent structural SVM, where latent subactions are learned automatically. We illustrate the effectiveness of our approach over common baselines on a new half-million frame dataset of continuous YouTube videos.
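To make the "linear time, constant storage, and naturally online" claim concrete, the following is a minimal sketch of a generic segmental (semi-Markov) parser with a bounded segment length. It is not the grammar or scoring function from the paper: the function name `online_segmental_parse`, the per-frame label scores, the `max_len` bound, and the flat `segment_cost` are all assumptions made for illustration. Because only the last `max_len` frames and prefix scores are kept, memory does not grow with video length, and each new frame costs a fixed amount of work.

```python
from collections import deque


def online_segmental_parse(frame_scores, max_len, segment_cost=0.0):
    """Illustrative online segmental parser (hypothetical, not the paper's model).

    frame_scores: iterable of per-frame score lists, one score per label.
    max_len:      assumed upper bound on segment length, in frames.
    segment_cost: assumed flat penalty paid for every segment.

    After each frame, yields (best_prefix_score, label, length), where label
    and length describe the last segment of the best parse so far.
    """
    # best[-k] is the best score of the prefix that ends k frames before
    # the current frame; only the last max_len values are ever needed.
    best = deque([0.0], maxlen=max_len)
    # Score vectors of the most recent max_len frames.
    recent = deque(maxlen=max_len)

    for scores in frame_scores:
        recent.append(list(scores))
        n_labels = len(scores)
        running = [0.0] * n_labels  # summed label scores over the candidate segment
        top = (float("-inf"), None, None)

        # Try every segment length that ends at the current frame.
        for length in range(1, len(recent) + 1):
            frame = recent[-length]
            for k in range(n_labels):
                running[k] += frame[k]
            prev = best[-length]
            for k in range(n_labels):
                total = prev + running[k] - segment_cost
                if total > top[0]:
                    top = (total, k, length)

        best.append(top[0])
        yield top


# Usage: two labels, four frames; label 0 is favored early, label 1 late.
stream = [[1, 0], [1, 0], [0, 1], [0, 1]]
for t, (score, label, length) in enumerate(
    online_segmental_parse(stream, max_len=2, segment_cost=0.5), start=1
):
    print(t, score, label, length)
```

The per-frame cost here is proportional to `max_len` times the number of labels, so total time is linear in the number of frames. Recovering the full segmentation, rather than only the latest segment, would need extra bookkeeping that this sketch leaves out.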
