Regularizing Long Short Term Memory with 3D Human-Skeleton Sequences for Action Recognition

2019-12-23

Abstract

This paper argues that large-scale action recognition in video can be greatly improved by providing an additional modality in the training data, namely 3D human-skeleton sequences, which complement poorly represented or missing features of human actions in the training videos. For recognition, we use a Long Short Term Memory (LSTM) network grounded onto the video via a deep Convolutional Neural Network (CNN). Training of the LSTM is regularized using the output of another encoder LSTM (eLSTM) grounded on 3D human-skeleton training data. For this regularized training, we modify standard backpropagation through time (BPTT) to address the well-known issues with gradient descent in constrained optimization. Our evaluation on three benchmark datasets (Sports-1M, HMDB-51, and UCF101) shows accuracy improvements from 1.7% up to 14.8% relative to the state of the art.
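
To make the idea of regularizing one network with the output of another concrete, here is a minimal sketch under loud assumptions. It shows a plain penalty formulation: a classification loss plus a squared-distance term that pulls the video LSTM's hidden state toward a skeleton encoding. It is not the modified BPTT procedure the abstract describes, which the authors introduce because plain gradient descent is known to behave poorly on constrained problems. The function names, the `weight` parameter, and the squared-distance penalty are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    """Cross-entropy of a single example, computed in a numerically stable way."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]


def regularized_loss(logits, label, video_hidden, skeleton_code, weight=0.1):
    """Classification loss plus a penalty tying the video state to a skeleton encoding.

    Assumption: the regularizer is a simple squared Euclidean distance between
    the video LSTM hidden state and the eLSTM output. The paper's actual
    formulation and its modified BPTT are not reproduced here.
    """
    classification = softmax_cross_entropy(logits, label)
    penalty = 0.5 * np.sum((video_hidden - skeleton_code) ** 2)
    return classification + weight * penalty


# Illustrative usage with made-up vectors (not data from the paper).
logits = np.zeros(3)                    # uniform scores over 3 hypothetical classes
video_hidden = np.array([1.0, 2.0])     # stand-in for the video LSTM hidden state
skeleton_code = np.array([0.0, 2.0])    # stand-in for the eLSTM encoding
loss = regularized_loss(logits, 0, video_hidden, skeleton_code)
```

In this penalty form, `weight` trades off classification accuracy against agreement with the skeleton encoding. Since skeleton data is described as a training-time modality, the penalty term would only be evaluated during training.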
