Abstract
We present an attention-based model that reasons on human body shape and motion dynamics to identify individuals in the absence of RGB information, hence in the dark. Our approach leverages unique 4D spatio-temporal signatures to address the identification problem across days. Formulated as a reinforcement learning task, our model is based on a combination of convolutional and recurrent neural networks with the goal of identifying small, discriminative regions indicative of human identity. We demonstrate that our model produces state-of-the-art results on several published datasets given only depth images. We further study the robustness of our model towards viewpoint, appearance, and volumetric changes. Finally, we share insights gleaned from interpretable 2D, 3D, and 4D visualizations of our model’s spatio-temporal attention.