Abstract
We propose a novel approach for learning features from weakly-supervised data by joint ranking and classification. To exploit data with weak labels, we jointly train a feature extraction network with a ranking loss and a classification network with a cross-entropy loss. We obtain high-quality, compact, discriminative features with few parameters, learned on relatively small datasets without additional annotations. This enables us to tackle tasks with specialized images that are not very similar to the more generic ones in existing fully-supervised datasets. We show that the resulting features, in combination with a linear classifier, surpass the state of the art on the Hipster Wars dataset despite using features that are only 0.3% of the size. Our proposed features significantly outperform those obtained from networks trained on ImageNet, despite being 32 times smaller (128 single-precision floats), trained on noisy and weakly-labeled data, and using only 1.5% of the number of parameters.
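To make the joint objective concrete, the following is a minimal sketch of how a ranking loss and a cross-entropy loss might be combined for a single training example. The abstract does not specify the form of the ranking loss or how the two terms are weighted, so the triplet hinge formulation, the `margin` value, and the weight `alpha` are illustrative assumptions, and all function names are hypothetical.

```python
import math


def triplet_ranking_loss(anchor, similar, dissimilar, margin=1.0):
    """Hinge-style ranking loss on feature vectors (assumed form).

    Zero when the similar item is closer to the anchor than the
    dissimilar item by at least `margin`; positive otherwise.
    """
    d_pos = math.dist(anchor, similar)
    d_neg = math.dist(anchor, dissimilar)
    return max(0.0, margin + d_pos - d_neg)


def cross_entropy_loss(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]


def joint_loss(anchor, similar, dissimilar, logits, label, alpha=0.1):
    """Ranking term plus weighted classification term.

    `alpha` is a hypothetical weighting, not a value from the paper.
    """
    ranking = triplet_ranking_loss(anchor, similar, dissimilar)
    classification = cross_entropy_loss(logits, label)
    return ranking + alpha * classification


if __name__ == "__main__":
    # Toy 2-D features standing in for the 128-float descriptors.
    loss = joint_loss(
        anchor=[0.0, 0.0],
        similar=[0.1, 0.0],
        dissimilar=[0.5, 0.5],
        logits=[2.0, 0.5, -1.0],
        label=0,
    )
    print(f"joint loss: {loss:.4f}")
```

In this sketch the ranking term acts on the features directly, while the classification term acts on logits that a separate classification network would produce from those features from weak labels; in actual training, both terms would be backpropagated into the shared feature extractor.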