Abstract
Extraction of local feature descriptors is a vital stage in
the solution pipelines for numerous computer vision tasks.
Learning-based approaches improve performance in certain tasks, but still cannot replace handcrafted features in
general. In this paper, we improve the learning of local
feature descriptors by optimizing the performance of descriptor matching, which is a common stage that follows
descriptor extraction in local-feature-based pipelines, and
can be formulated as nearest neighbor retrieval. Specifically, we directly optimize a ranking-based retrieval performance metric, Average Precision, using deep neural networks. This general-purpose solution can also be viewed
as a listwise learning-to-rank approach, which is advantageous compared to recent local ranking approaches. On
standard benchmarks, descriptors learned with our formulation achieve state-of-the-art results in patch verification,
patch retrieval, and image matching.
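
For readers unfamiliar with the metric, the following is a minimal sketch of how Average Precision can be computed for a single query descriptor when matching is treated as nearest neighbor retrieval. It illustrates only the standard, non-differentiable metric, not the optimization procedure proposed in this paper. The function name `average_precision` and the inputs `distances` and `is_match` are illustrative assumptions, not names taken from the paper.

```python
def average_precision(distances, is_match):
    """Average Precision for one query (illustrative sketch).

    distances: distance from the query descriptor to each candidate.
    is_match:  True where the candidate is a true match for the query.
    Candidates are ranked by ascending distance (nearest neighbor first).
    """
    # Rank candidate indices from nearest to farthest.
    order = sorted(range(len(distances)), key=lambda i: distances[i])

    hits = 0             # true matches seen so far in the ranking
    precision_sum = 0.0  # sum of precision@k over ranks k that hold a match
    for rank, idx in enumerate(order, start=1):
        if is_match[idx]:
            hits += 1
            precision_sum += hits / rank

    # Convention assumed here: AP is 0 when the query has no true match.
    return precision_sum / hits if hits else 0.0


# Hypothetical toy query: true matches land at ranks 1 and 4.
toy_distances = [0.1, 0.4, 0.2, 0.9]
toy_is_match = [True, False, False, True]
print(average_precision(toy_distances, toy_is_match))  # (1/1 + 2/4) / 2 = 0.75
```

Because the value depends on the entire ranked list rather than on individual pairs or triplets, it serves as a listwise objective; the sorting step is what makes it non-differentiable in this plain form.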