Beyond Short Snippets: Deep Networks for Video Classification

2019-12-25

Abstract

Convolutional neural networks (CNNs) have been extensively applied for image recognition problems, giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full length videos. The first method explores various convolutional temporal feature pooling architectures, examining the various design choices which need to be made when adapting a CNN for this task. The second proposed method explicitly models the video as an ordered sequence of frames. For this purpose we employ a recurrent neural network that uses Long Short-Term Memory (LSTM) cells which are connected to the output of the underlying CNN. Our best networks exhibit significant performance improvements over previously published results on the Sports 1 million dataset (73.1% vs. 60.9%) and the UCF-101 datasets with (88.6% vs. 88.0%) and without additional optical flow information (82.6% vs. 73.0%).
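
The contrast between the two methods can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes per-frame CNN features are already available as plain lists of floats, uses element-wise max pooling as one possible temporal pooling choice, and runs a single small LSTM layer with made-up weights. The names `temporal_max_pool`, `lstm_final_hidden`, `TOY_FRAMES` and `TOY_PARAMS` are hypothetical and exist only for this illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
# One gate's parameters: (weight rows over [input ; hidden], biases).
GateParams = Tuple[List[Vector], Vector]


def temporal_max_pool(frame_features: Sequence[Vector]) -> Vector:
    """Element-wise max over time: one fixed-size descriptor per video.

    The result does not depend on the order of the frames.
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    return [max(column) for column in zip(*frame_features)]


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def _affine(weights: List[Vector], bias: Vector, z: Vector) -> Vector:
    """Compute weights @ z + bias with plain lists."""
    return [sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(weights, bias)]


def lstm_final_hidden(frame_features: Sequence[Vector],
                      params: Dict[str, GateParams],
                      hidden_size: int) -> Vector:
    """Run a single LSTM layer over the frames in order.

    Returns the hidden state after the last frame, which depends on
    frame order because each step sees the previous hidden state.
    """
    h = [0.0] * hidden_size  # hidden state
    c = [0.0] * hidden_size  # cell state
    for x in frame_features:
        z = list(x) + h  # concatenate current input and previous hidden state
        i = [_sigmoid(v) for v in _affine(*params["i"], z)]   # input gate
        f = [_sigmoid(v) for v in _affine(*params["f"], z)]   # forget gate
        o = [_sigmoid(v) for v in _affine(*params["o"], z)]   # output gate
        g = [math.tanh(v) for v in _affine(*params["g"], z)]  # candidate cell
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


# Toy stand-ins for CNN outputs and learned weights (assumed values, not
# taken from the paper): two frames, 2-D features, one hidden unit.
TOY_FRAMES: List[Vector] = [[1.0, 0.0], [0.0, 1.0]]
TOY_PARAMS: Dict[str, GateParams] = {
    name: ([[0.5, -0.3, 0.1]], [0.0]) for name in ("i", "f", "o", "g")
}
```

Pooling `TOY_FRAMES` in forward or reversed order gives the same descriptor, whereas `lstm_final_hidden` returns different values for the two orders. That difference is what modelling the video as an ordered sequence of frames refers to.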
