Abstract
Learning image similarity metrics in an end-to-end fashion with deep networks has demonstrated excellent results
on tasks such as clustering and retrieval. However, current
methods all focus on a very local view of the data. In this
paper, we propose a new metric learning scheme, based on
structured prediction, that is aware of the global structure
of the embedding space, and which is designed to optimize
a clustering quality metric (NMI). We show state-of-the-art
performance on standard datasets, such as CUB200-2011
[37], Cars196 [18], and Stanford Online Products [30], on
the NMI and R@K evaluation metrics.