Abstract
Localizing a query image against a 3D model at large
scale is a hard problem, since 2D-3D matches become more
and more ambiguous as the model size increases. This creates a need for pose estimation strategies that can handle very low inlier ratios. In this paper, we draw new insights into the geometric information available from the 2D-3D matching process. As modern descriptors are not invariant to large variations in viewpoint, we are able to find
the rays in space used to triangulate a given point that are
closest to a query descriptor. It is well known that two correspondences constrain the camera to lie on the surface of
a torus. Adding knowledge of the direction of triangulation,
we are able to approximate the position of the camera from
two matches alone. We derive a geometric solver¹ that can
compute this position in under 1 microsecond. Using this
solver, we propose a simple yet powerful outlier filter which
scales quadratically in the number of matches. We validate
the accuracy of our solver and demonstrate the usefulness
of our method in real-world settings.
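
The torus constraint mentioned above follows from the inscribed angle theorem: a calibrated camera that observes two 3D points must see them under the same angle as the angle between their two bearing rays. The following is a minimal sketch of that membership test only, not the solver derived in this paper; the function names `angle_between` and `on_torus`, the tolerance, and the example coordinates are assumptions made for illustration.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two 3D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def on_torus(center, X1, X2, f1, f2, tol=1e-9):
    """Check whether a candidate camera center (hypothetical helper) sees the
    3D points X1, X2 under the same angle as the bearing rays f1, f2."""
    c = np.asarray(center, dtype=float)
    subtended = angle_between(np.asarray(X1, dtype=float) - c,
                              np.asarray(X2, dtype=float) - c)
    observed = angle_between(f1, f2)
    return abs(subtended - observed) < tol

# Illustrative setup: a camera at the origin observing two points.
X1 = np.array([1.0, 0.0, 2.0])
X2 = np.array([-1.0, 0.0, 2.0])
f1 = X1 / np.linalg.norm(X1)  # bearing rays of the true camera
f2 = X2 / np.linalg.norm(X2)

true_center = np.zeros(3)
rotated_center = np.array([0.0, 2.0, 2.0])  # true center rotated about the X1-X2 axis
off_center = np.array([0.0, 0.0, 0.5])      # closer to the points, larger subtended angle
```

Here both `true_center` and `rotated_center` satisfy the angle constraint while `off_center` does not, which illustrates why two matches alone leave a two-dimensional surface of candidate positions and why additional information, such as the direction of triangulation, is needed to narrow it down.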