Abstract
Gestures are a common form of human communication and are important for human-computer interfaces (HCI). Recent approaches to gesture recognition use deep learning methods, including multi-channel methods. We show that gesture recognition improves significantly when spatial channels are focused on the hands, particularly when the channels are fused using a sparse network. Using this technique, we improve performance on the ChaLearn IsoGD dataset from a previous best of 67.71% to 82.07%, and on the NVIDIA dataset from 83.8% to 91.28%.
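The fusion idea mentioned above can be sketched in miniature: per-channel class scores combined through a mostly-zero weight matrix, so each class draws on only a few channels. The channel count, class count, and one-hot sparsity pattern below are illustrative assumptions for the sketch, not the paper's actual architecture.

```python
import numpy as np

def sparse_fuse(channel_scores, weights):
    """Fuse per-channel class scores with a sparse weight matrix.

    channel_scores: (num_channels, num_classes) scores per channel
    weights:        (num_channels, num_classes) mostly-zero fusion weights
    returns:        (num_classes,) fused class scores
    """
    return (channel_scores * weights).sum(axis=0)

rng = np.random.default_rng(0)
# Hypothetical setup: 4 channels (e.g. global and hand-focused views),
# 10 gesture classes.
num_channels, num_classes = 4, 10
scores = rng.random((num_channels, num_classes))

# Sparse weights: here each class attends to exactly one channel,
# an extreme (illustrative) sparsity pattern.
weights = np.zeros((num_channels, num_classes))
weights[rng.integers(0, num_channels, size=num_classes),
        np.arange(num_classes)] = 1.0

fused = sparse_fuse(scores, weights)
pred = int(np.argmax(fused))
```

A dense fusion layer would learn nonzero weights for every channel-class pair; the sparse variant restricts each class to the channels most informative for it.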