Abstract. We present a novel approach for model-based 6D pose refinement
in color data. Building on the established idea of contour-based
pose tracking, we teach a deep neural network to predict a translational
and rotational update. At the core, we propose a new visual loss that
drives the pose update by aligning object contours, thus avoiding the
definition of any explicit appearance model. In contrast to previous work,
our method is correspondence-free and segmentation-free, can handle occlusion,
and is agnostic to geometrical symmetry as well as visual ambiguities.
Additionally, we observe strong robustness towards rough initialization. The approach can run in real time and produces pose accuracies
that come close to 3D ICP without the need for depth data. Furthermore, our networks are trained from purely synthetic data and will be
published together with the refinement code at
http://campar.in.tum.de/Main/FabianManhardt to ensure reproducibility.