Detecting events and key actors in multi-person videos

2019-12-26

Abstract

Multi-person event recognition is a challenging task, often with many people active in the scene but only a small subset contributing to an actual event. In this paper, we propose a model which learns to detect events in such videos while automatically “attending” to the people responsible for the event. Our model does not use explicit annotations regarding who or where those people are during training and testing. In particular, we track people in videos and use a recurrent neural network (RNN) to represent the track features. We learn time-varying attention weights to combine these features at each time instant. The attended features are then processed using another RNN for event detection/classification. Since most video datasets with multiple people are restricted to a small number of videos, we also collected a new basketball dataset comprising 257 basketball games with 14K event annotations corresponding to 11 event classes. Our model outperforms state-of-the-art methods for both event classification and detection on this new dataset. Additionally, we show that the attention mechanism is able to consistently localize the relevant players.
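The attention step described in the abstract, combining per-person track features with time-varying weights, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how attention scores are computed, so the linear scoring vector `score_w`, the function name `attend_over_tracks`, and the array shapes are all assumptions made for illustration, and the random features stand in for the output of the track-level RNN.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_over_tracks(track_feats, score_w):
    """Combine per-person track features with time-varying attention weights.

    track_feats: array of shape (T, N, D), one D-dim feature per person track
                 (N tracks) at each of T time steps.
    score_w:     array of shape (D,), a hypothetical linear scoring vector
                 (an assumption; the abstract does not give the scoring form).
    Returns (attended, weights) with shapes (T, D) and (T, N).
    """
    scores = track_feats @ score_w                 # (T, N) one score per track per step
    weights = softmax(scores, axis=1)              # normalize over tracks at each step
    attended = np.einsum("tn,tnd->td", weights, track_feats)  # weighted sum over tracks
    return attended, weights

# Placeholder data standing in for track-level RNN outputs.
rng = np.random.default_rng(0)
T, N, D = 5, 4, 8
track_feats = rng.normal(size=(T, N, D))
score_w = rng.normal(size=D)
attended, weights = attend_over_tracks(track_feats, score_w)
```

In a full model along the lines of the abstract, `attended` would be fed step by step into a second RNN for event classification, and the per-step `weights` would indicate which tracks the model treats as key actors.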
