Abstract
Cross-modal retrieval has attracted intensive attention
in recent years. Measuring the semantic similarity between
heterogeneous data objects is an essential yet challenging
problem in cross-modal retrieval. In this paper, we propose
an online learning method to learn the similarity function
between heterogeneous modalities by preserving the relative
similarity in the training data, which is modeled as
a set of bi-directional hinge loss constraints on the cross-modal
training triplets. The overall online similarity function
learning problem is solved with the margin-based
Passive-Aggressive algorithm. We further extend the approach
to learn a similarity function in reproducing kernel
Hilbert spaces by kernelizing it and combining
multiple kernels derived from different layers of the CNN
features using the Hedging algorithm. Theoretical mistake
bounds are given for our methods. Experiments conducted
on real-world datasets demonstrate the effectiveness of
our methods.
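
To make the idea of a bi-directional hinge loss on cross-modal triplets concrete, the following is a minimal sketch of one Passive-Aggressive (PA-I) step. It rests on assumptions that the abstract does not state: a bilinear similarity S(x, y) = x^T W y, a unit margin, an aggressiveness cap C, and one sequential update per retrieval direction. The names `similarity`, `pa_update`, and `bidirectional_update` are hypothetical and are not taken from the paper.

```python
def similarity(W, q, p):
    """Bilinear score q^T W p, with W stored as a list of rows (assumed form)."""
    return sum(q[i] * W[i][j] * p[j]
               for i in range(len(q)) for j in range(len(p)))


def pa_update(W, q, pos, neg, C=1.0, margin=1.0):
    """One PA-I step on the triplet (q, pos, neg); returns (new W, hinge loss)."""
    diff = [a - b for a, b in zip(pos, neg)]
    # Hinge loss on relative similarity: want S(q, pos) >= S(q, neg) + margin.
    loss = max(0.0, margin - similarity(W, q, diff))
    # Squared Frobenius norm of the rank-one direction q * diff^T.
    norm_sq = sum(a * a for a in q) * sum(d * d for d in diff)
    if loss == 0.0 or norm_sq == 0.0:
        return W, loss  # passive: constraint already satisfied
    tau = min(C, loss / norm_sq)  # aggressive step size, capped by C
    new_W = [[W[i][j] + tau * q[i] * diff[j] for j in range(len(diff))]
             for i in range(len(q))]
    return new_W, loss


def transpose(W):
    return [list(col) for col in zip(*W)]


def bidirectional_update(W, xy_triplet, yx_triplet, C=1.0):
    """Apply one update per direction: x queries y, then y queries x."""
    x, y_pos, y_neg = xy_triplet
    W, loss_xy = pa_update(W, x, y_pos, y_neg, C)
    # For the reverse direction S(x, y) = y^T W^T x, so update W^T instead.
    y, x_pos, x_neg = yx_triplet
    Wt, loss_yx = pa_update(transpose(W), y, x_pos, x_neg, C)
    return transpose(Wt), loss_xy + loss_yx
```

Each call leaves W unchanged when the triplet already satisfies the margin and otherwise moves W just far enough (up to the cap C) to reduce the violation, which is the behaviour that mistake-bound analyses of Passive-Aggressive methods typically build on.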