Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition


2019-10-08


Abstract: Recent works use 3D convolutional neural networks to explore spatio-temporal information for human action recognition. However, they either ignore the correlation between spatial and temporal features or incur high computational cost when extracting spatio-temporal features. In this work, we propose a novel and efficient Mutually Reinforced Spatio-Temporal Convolutional Tube (MRST) for human action recognition. It decomposes 3D inputs into spatial and temporal representations, mutually enhances both by exploiting the interaction between spatial and temporal information, and selectively emphasizes informative spatial appearance and temporal motion, while reducing structural complexity. Moreover, we design three types of MRSTs according to the order in which spatial and temporal information are enhanced; each contains a spatio-temporal decomposition unit, a mutually reinforced unit, and a spatio-temporal fusion unit. An end-to-end deep network, MRST-Net, is also proposed based on the MRSTs to better explore spatio-temporal information in human actions. Extensive experiments show that MRST-Net yields the best performance compared to state-of-the-art approaches.
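To make the complexity argument concrete, the following minimal sketch compares the weight count of a full 3D convolution with that of a generic spatial/temporal decomposition. It is not the MRST design itself: the abstract does not specify kernel shapes or how the two branches are wired, so the parallel 1 x k x k and k x 1 x 1 layout, the channel counts, and the function names `conv3d_params` and `decomposed_params` are all assumptions chosen for illustration.

```python
def conv3d_params(c_in, c_out, k_t, k_h, k_w):
    """Weight count of a full 3D convolution (bias ignored)."""
    return c_in * c_out * k_t * k_h * k_w


def decomposed_params(c_in, c_out, k_t, k_h, k_w):
    """Weight count of a spatial (1 x k_h x k_w) branch plus a temporal
    (k_t x 1 x 1) branch applied to the same input (bias ignored).

    Assumption: the two branches run in parallel; the abstract does not
    state the actual MRST layout.
    """
    spatial = c_in * c_out * 1 * k_h * k_w
    temporal = c_in * c_out * k_t * 1 * 1
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
split = decomposed_params(64, 64, 3, 3, 3)
print(full, split, round(split / full, 3))  # 110592 49152 0.444
```

Under these assumptions the decomposed layer holds fewer than half the weights of the full 3D kernel. Any interaction or fusion units added on top would contribute further parameters that this sketch does not count.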
