An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition
Abstract
Skeleton-based action recognition is an important task that requires an adequate understanding of the movement characteristics of a human action from the given skeleton sequence. Recent studies have shown that exploiting the spatial and temporal features of the skeleton sequence is vital for this task. Nevertheless, how to effectively extract discriminative spatial and temporal features remains a challenging problem. In this paper, we propose a novel Attention Enhanced Graph Convolutional LSTM Network (AGC-LSTM) for human action recognition from skeleton data. The proposed AGC-LSTM can not only capture discriminative features in spatial configuration and temporal dynamics but also explore the co-occurrence relationship between the spatial and temporal domains. We also present a temporal hierarchical architecture that increases the temporal receptive field of the top AGC-LSTM layer, which boosts the ability to learn high-level semantic representations and significantly reduces the computational cost. Furthermore, to select discriminative spatial information, an attention mechanism is employed to enhance the information of key joints in each AGC-LSTM layer. Experimental results are provided on two datasets: the NTU RGB+D dataset and the Northwestern-UCLA dataset. The comparisons demonstrate the effectiveness of our approach and show that it outperforms state-of-the-art methods on both datasets.
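To make the core recurrent unit concrete, the following is a minimal NumPy sketch of a single AGC-LSTM step: every LSTM gate transform is replaced by a graph convolution over the skeleton adjacency matrix, and a simple spatial-attention re-weighting enhances the hidden features of key joints. All names here (`graph_conv`, `agc_lstm_step`, the gate weight list) are hypothetical, and the gating and attention details are deliberately simplified relative to the full model described in the paper.

```python
import numpy as np

def graph_conv(X, A, W):
    # Graph convolution: aggregate neighboring joint features via the
    # (normalized) adjacency matrix A, then project with weights W.
    # X: (num_joints, in_dim), A: (num_joints, num_joints), W: (in_dim, out_dim)
    return A @ X @ W

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def agc_lstm_step(x_t, h_prev, c_prev, A, params):
    """One AGC-LSTM step: standard LSTM gating, but with every input/hidden
    transform realized as a graph convolution over the skeleton graph."""
    Wxi, Whi, Wxf, Whf, Wxo, Who, Wxg, Whg = params
    i = sigmoid(graph_conv(x_t, A, Wxi) + graph_conv(h_prev, A, Whi))   # input gate
    f = sigmoid(graph_conv(x_t, A, Wxf) + graph_conv(h_prev, A, Whf))   # forget gate
    o = sigmoid(graph_conv(x_t, A, Wxo) + graph_conv(h_prev, A, Who))   # output gate
    g = np.tanh(graph_conv(x_t, A, Wxg) + graph_conv(h_prev, A, Whg))   # candidate cell
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    # Spatial attention (simplified): score each joint from its hidden state,
    # normalize with softmax, and enhance the features of key joints.
    scores = h.sum(axis=1)                       # (num_joints,)
    alpha = np.exp(scores) / np.exp(scores).sum()
    h_att = h * (1.0 + alpha[:, None])           # attention-enhanced hidden state
    return h_att, c

# Toy run: 5 joints with 3-D coordinates, hidden size 4, identity adjacency.
rng = np.random.default_rng(0)
J, D, H = 5, 3, 4
A = np.eye(J)                                    # placeholder skeleton adjacency
params = [rng.normal(size=(D, H)) if k % 2 == 0 else rng.normal(size=(H, H))
          for k in range(8)]                     # alternating input/hidden weights
h = np.zeros((J, H))
c = np.zeros((J, H))
for t in range(4):                               # a short skeleton sequence
    x_t = rng.normal(size=(J, D))
    h, c = agc_lstm_step(x_t, h, c, A, params)
print(h.shape)  # (5, 4): one attention-enhanced hidden vector per joint
```

In this sketch the hidden state keeps one feature vector per joint, so the temporal hierarchy described above would correspond to stacking such cells while downsampling the sequence in time between layers.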