Abstract
This paper contributes to the automatic classification and localization of human actions in video. Whereas motion is the key ingredient in modern approaches, we assess the benefits of including objects in the video representation. Rather than considering a handful of carefully selected and localized objects, we conduct an empirical study on the benefit of encoding 15,000 object categories for action, using 6 datasets totaling more than 200 hours of video and covering 180 action classes. Our key contributions are: i) the first in-depth study of encoding objects for actions; ii) we show that objects matter for actions, and are often semantically relevant as well; iii) we establish that actions have object preferences: rather than using all objects, selection is advantageous for action recognition; iv) we reveal that object-action relations are generic, which allows transferring these relationships from one domain to another; and v) objects, when combined with motion, improve the state-of-the-art for both action classification and localization.