Abstract
Matching local image descriptors is a key step in many
computer vision applications. For more than a decade,
hand-crafted descriptors such as SIFT have been used for
this task. Recently, multiple new descriptors learned from
data have been proposed and shown to improve on SIFT in
terms of discriminative power. This paper is dedicated to
an extensive experimental evaluation of learned local features to establish a single evaluation protocol that ensures
comparable results. In terms of matching performance, we
evaluate the different descriptors against standard criteria. However, considering matching performance in isolation provides only an incomplete measure of a descriptor’s
quality. For example, finding additional correct matches between similar images does not necessarily lead to better
performance when trying to match images under extreme
viewpoint or illumination changes. Besides pure descriptor
matching, we thus also evaluate the different descriptors in
the context of image-based reconstruction. This enables us
to study the descriptor performance on a set of more practical criteria including image retrieval, the ability to register
images under strong viewpoint and illumination changes,
and the accuracy and completeness of the reconstructed
cameras and scenes. To facilitate future research, the full
evaluation pipeline is made publicly available.