Abstract. The goal of this paper is to estimate the 3D coordinates of
the hand joints from a single depth image. To balance accuracy and
real-time performance, we design a novel three-branch convolutional neural network named the Hand Branch Ensemble
network (HBE), where the three branches correspond to the three parts
of a hand: the thumb, the index finger and the other fingers. The design of the HBE network is motivated by the differences in functional importance among the fingers. In
addition, a feature ensemble layer together with a low-dimensional embedding layer enforces the overall hand shape constraints. The experimental
results on three public datasets demonstrate that our approach achieves
performance comparable to or better than state-of-the-art methods with less
training data, shorter training time and a faster frame rate.