Abstract. Traditional video understanding tasks include human action recognition and actor/object semantic segmentation. However, the combined task of providing semantic segmentation for different actor classes simultaneously with their action class remains challenging but necessary for many applications. In this work, we propose a new end-to-end architecture for tackling this task in videos. Our model effectively leverages multiple input modalities, contextual information, and multitask learning in the video to directly output semantic segmentations in a single unified framework. We train and benchmark our model on the Actor-Action Dataset (A2D) for joint actor-action semantic segmentation, and demonstrate state-of-the-art performance for both segmentation and detection. We also perform experiments verifying that our approach improves performance for zero-shot recognition, indicating the generalizability of our jointly learned feature space.