Abstract
Understanding human actions is a key problem in computer vision. However, recognizing actions is only the first step of understanding what a person is doing. In this paper, we introduce the problem of predicting why a person has performed an action in images. This problem has many applications in human activity understanding, such as anticipating or explaining an action. To study this problem, we introduce a new dataset of people performing actions annotated with likely motivations. However, the information in an image alone may not be sufficient to automatically solve this task. Since humans can rely on their lifetime of experiences to infer motivation, we propose to give computer vision systems access to some of these experiences by using recently developed natural language models to mine knowledge stored in massive amounts of text. While we are still far away from fully understanding motivation, our results suggest that transferring knowledge from language into vision can help machines understand why people in images might be performing an action.