Efficient Action Localization with Approximately Normalized Fisher Vectors


2019-12-11


Abstract

The Fisher vector (FV) representation is a high-dimensional extension of the popular bag-of-words representation. Transforming the FV with power and ℓ2 normalizations has been shown to significantly improve its performance, and has led to state-of-the-art results for a range of image and video classification and retrieval tasks. These normalizations, however, render the representation non-additive over local descriptors. Combined with its high dimensionality, this makes the FV computationally expensive for localization tasks. In this paper we first present approximations to both of these normalizations, which yield significant improvements in the memory and computational costs of the FV when used for localization. Second, we show how these approximations can be used to define upper bounds on the score function that can be evaluated efficiently, which enables the use of branch-and-bound search as an alternative to exhaustive sliding-window search. We present experimental results on classification and temporal localization of actions in videos. These show that our approximations lead to a speedup of at least one order of magnitude, while maintaining state-of-the-art action recognition and localization performance.
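The non-additivity mentioned in the abstract can be illustrated with a small sketch. The code below is not the paper's method or its approximations. It only shows, on made-up toy vectors, that sum pooling is additive over two windows while the exact power and ℓ2 normalizations are not. The function names, the toy values, and the exponent `alpha=0.5` (a signed square root) are assumptions chosen for illustration.

```python
import math

def sum_pool(descriptors):
    """Additive aggregation: element-wise sum of per-descriptor vectors."""
    dim = len(descriptors[0])
    return [sum(d[i] for d in descriptors) for i in range(dim)]

def power_l2_normalize(v, alpha=0.5):
    """Signed power normalization followed by L2 normalization.

    alpha=0.5 is an assumed exponent for this sketch.
    """
    powered = [math.copysign(abs(x) ** alpha, x) for x in v]
    norm = math.sqrt(sum(x * x for x in powered))
    return [x / norm for x in powered] if norm > 0 else powered

# Toy per-descriptor contributions (hypothetical values, not real FV gradients).
window_a = [[4.0, -1.0], [0.0, 9.0]]
window_b = [[1.0, 1.0]]

# Without normalization, pooling the union equals the sum of the pooled parts.
pooled_a = sum_pool(window_a)
pooled_b = sum_pool(window_b)
pooled_ab = sum_pool(window_a + window_b)
additive = all(
    math.isclose(x + y, z) for x, y, z in zip(pooled_a, pooled_b, pooled_ab)
)

# With normalization, the normalized whole differs from the sum of normalized parts.
norm_ab = power_l2_normalize(pooled_ab)
norm_sum = [
    x + y
    for x, y in zip(power_l2_normalize(pooled_a), power_l2_normalize(pooled_b))
]
non_additive = not all(math.isclose(x, y) for x, y in zip(norm_ab, norm_sum))

print(additive, non_additive)
```

Because the normalized vector of a window cannot be assembled from the normalized vectors of its parts, each candidate window in a sliding-window search would need its own full-dimensional normalization, which is the cost the abstract refers to.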
