Abstract
A novel dataset for benchmarking image-based localization is presented. With increasing research interest in
visual place recognition and localization, several datasets
have been published in the past few years. One of the evident limitations of existing datasets is that precise ground
truth camera poses of query images are not available in a
meaningful 3D metric system. This is in part because the underlying 3D models of these datasets are reconstructed
using Structure-from-Motion methods. So far, little attention has
been paid to metric evaluations of localization accuracy. In
this paper we address the question of whether state-of-the-art visual localization techniques can be applied to tasks
with demanding accuracy requirements. We acquired training data for a large indoor environment with cameras and a
LiDAR scanner. In addition, we collected over 2000 query
images with cell phone cameras. Using LiDAR point clouds
as a reference, we employed a semi-automatic approach to
estimate six-degree-of-freedom (6-DoF) camera poses precisely
in the world coordinate system. The proposed dataset enables us to quantitatively assess the performance of various
algorithms using a fair and intuitive metric.