Abstract
Estimating the relative rigid pose between two RGB-D
scans of the same underlying environment is a fundamental
problem in computer vision, robotics, and computer graphics. Most existing approaches allow only limited relative
pose changes since they require considerable overlap between the input scans. We introduce a novel approach that
extends the scope to extreme relative poses, with little or
even no overlap between the input scans. The key idea is to
infer more complete scene information about the underlying environment and match on the completed scans. In particular, instead of only performing scene completion from
each individual scan, our approach alternates between relative pose estimation and scene completion. This allows us
to perform scene completion using information from
both input scans at later iterations, yielding better results
for both scene completion and relative pose estimation. Experimental results on benchmark datasets show that our approach leads to considerable improvements over state-of-the-art approaches for relative pose estimation. In particular, our approach provides encouraging relative pose estimates even between non-overlapping scans.
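
To make the notion of a relative rigid pose concrete, the sketch below recovers a rotation R and translation t from 3D points that have already been matched between two scans, using the standard Kabsch (SVD-based least-squares) solution. This is an illustration only and not the method proposed here: the function name `estimate_rigid_pose` and the synthetic data are assumptions, and the solver presupposes known correspondences, which is exactly what scans with little or no overlap do not directly provide.

```python
import numpy as np


def estimate_rigid_pose(src, dst):
    """Least-squares rigid pose (R, t) such that dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of matched 3D points (row i of src corresponds
    to row i of dst). Hypothetical helper for illustration; it implements
    the classical Kabsch solution, not the approach of the paper.
    """
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)

    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t


if __name__ == "__main__":
    # Synthetic check: apply a known pose and recover it.
    rng = np.random.default_rng(0)
    points = rng.normal(size=(50, 3))
    angle = 2.5  # radians, about the z-axis
    R_true = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle),  np.cos(angle), 0.0],
        [0.0,            0.0,           1.0],
    ])
    t_true = np.array([0.5, -1.0, 2.0])
    R_est, t_est = estimate_rigid_pose(points, points @ R_true.T + t_true)
    print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

A solver of this kind is only as good as the correspondences it is given, which is why matching on completed scans matters when the raw scans share little or no visible content.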