Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization


2019-12-10


Abstract This work presents a weakly supervised framework with deep neural networks for vision-based continuous sign language recognition, where the ordered gloss labels, but not their exact temporal locations, are available for each sign-sentence video, and the number of labeled sentences for training is limited. Our approach addresses the mapping of video segments to glosses by introducing a recurrent convolutional neural network for spatio-temporal feature extraction and sequence learning. We design a three-stage optimization process for our architecture. First, we develop an end-to-end sequence learning scheme and employ connectionist temporal classification (CTC) as the objective function for alignment proposal. Second, we take the alignment proposal as stronger supervision to tune our feature extractor. Finally, we optimize the sequence learning model with the improved feature representations, and design a weakly supervised detection network for regularization. We apply the proposed approach to a real-world continuous sign language recognition benchmark, and our method, with no extra supervision, achieves results comparable to the state of the art.
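
The alignment-proposal idea in the first stage can be illustrated with a minimal sketch. Assuming a trained sequence model already yields per-frame log-probabilities over glosses plus a CTC blank, a Viterbi pass over the blank-extended label sequence picks the most probable frame-level path that collapses to the known gloss order. The function name `ctc_forced_alignment`, the blank index 0, and the toy posteriors are illustrative assumptions, not taken from the paper, whose actual procedure may differ.

```python
import math

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_forced_alignment(log_probs, labels, blank=BLANK):
    """Return the most probable frame-level CTC path that collapses to `labels`.

    log_probs: list of T rows, each a list of per-class log-probabilities.
    labels:    ordered gloss indices without blanks, e.g. [1, 2].
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    T, S = len(log_probs), len(ext)
    neg_inf = -math.inf

    # score[t][s]: best log-probability of any path in state s at frame t
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_probs[0][ext[0]]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            # Allowed predecessors: stay, advance by one state, or skip
            # the blank between two different labels.
            candidates = [s]
            if s >= 1:
                candidates.append(s - 1)
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(s - 2)
            best = max(candidates, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == neg_inf:
                continue  # state s is unreachable at frame t
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best

    # A valid path ends in the last label or in the trailing blank.
    ends = [S - 1] if S == 1 else [S - 1, S - 2]
    s = max(ends, key=lambda p: score[T - 1][p])
    if score[T - 1][s] == neg_inf:
        raise ValueError("too few frames for this label sequence")

    # Follow back-pointers from the last frame to the first.
    path = []
    for t in range(T - 1, -1, -1):
        path.append(ext[s])
        s = back[t][s]
    return path[::-1]


# Toy example: 4 frames, classes {0: blank, 1: gloss A, 2: gloss B}
toy = [[0.1, 0.8, 0.1], [0.1, 0.7, 0.2], [0.6, 0.1, 0.3], [0.1, 0.1, 0.8]]
toy_log = [[math.log(p) for p in row] for row in toy]
print(ctc_forced_alignment(toy_log, [1, 2]))  # -> [1, 1, 0, 2]
```

The resulting per-frame labels are the kind of "stronger supervision" that a second stage could use to fine-tune a feature extractor with an ordinary frame-wise or segment-wise classification loss.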
