Modeling Sub-Event Dynamics in First-Person Action Recognition

2019-12-04

Abstract

First-person videos have unique characteristics such as heavy egocentric motion, strong preceding events, salient transitional activities and post-event impacts. Action recognition methods designed for third-person videos may not optimally represent actions captured by first-person videos. We propose a method to represent the high-level dynamics of sub-events in first-person videos by dynamically pooling features of sub-intervals of time series using a temporal feature pooling function. The sub-event dynamics are then temporally aligned to make a new series. To keep track of how the sub-event dynamics evolve over time, we recursively employ the Fast Fourier Transform on a pyramidal temporal structure. The Fourier coefficients of the segment define the overall video representation. We perform experiments on two existing benchmark first-person video datasets which have been captured in a controlled environment. Addressing this gap, we introduce a new dataset collected from YouTube which has a larger number of classes and a greater diversity of capture conditions, thereby more closely depicting real-world challenges in first-person video analysis. We compare our method to state-of-the-art first-person and generic video recognition algorithms. Our method consistently outperforms the nearest competitors by 10.3%, 3.3% and 11.7% respectively on the three datasets.
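
The pipeline outlined in the abstract (pool features over sub-intervals, order the pooled vectors into a new series, then take Fourier coefficients over a temporal pyramid) can be illustrated with a minimal sketch. This is one possible reading and not the authors' implementation. The abstract does not specify the pooling function, window size, stride, number of pyramid levels, or number of retained coefficients, so `max_pool`, `window`, `stride`, `levels`, and `k` below are all assumptions, as is keeping only coefficient magnitudes. A naive DFT stands in for the FFT to keep the example dependency-free; it yields the same coefficients.

```python
import cmath
from typing import Callable, List, Sequence

Vector = List[float]


def max_pool(frames: Sequence[Vector]) -> Vector:
    """Element-wise max over the frames of one sub-interval (assumed pooling)."""
    return [max(column) for column in zip(*frames)]


def pool_sub_events(
    features: Sequence[Vector],
    window: int,
    stride: int,
    pool: Callable[[Sequence[Vector]], Vector] = max_pool,
) -> List[Vector]:
    """Pool each sliding sub-interval into one vector.

    The pooled vectors, kept in temporal order, form the new series.
    """
    last_start = len(features) - window
    return [pool(features[s:s + window]) for s in range(0, last_start + 1, stride)]


def dft_magnitudes(signal: Sequence[float], k: int) -> List[float]:
    """Magnitudes of the first k DFT coefficients (naive O(n^2) stand-in for an FFT)."""
    n = len(signal)
    if n == 0:
        return [0.0] * k
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(signal)))
        for f in range(k)
    ]


def pyramid_descriptor(series: Sequence[Vector], levels: int, k: int) -> List[float]:
    """Concatenate k DFT magnitudes per feature dimension for every pyramid segment.

    Level 0 covers the whole series, level 1 splits it in two, level 2 in four, etc.
    """
    dims = len(series[0])
    descriptor: List[float] = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start = p * len(series) // parts
            end = (p + 1) * len(series) // parts
            segment = series[start:end]
            for d in range(dims):
                descriptor.extend(dft_magnitudes([v[d] for v in segment], k))
    return descriptor


# Toy input: 16 frames of 3-dimensional per-frame features (made-up values).
features = [[float(t), float(t % 3), 1.0] for t in range(16)]
series = pool_sub_events(features, window=4, stride=2)
descriptor = pyramid_descriptor(series, levels=2, k=2)
```

With these toy settings, the 16 frames yield 7 pooled vectors, and the descriptor has 3 dimensions × 2 coefficients × 3 pyramid segments = 18 entries. In practice a library routine such as `numpy.fft.fft` would replace the naive DFT.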

