Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for
Fine-grained Classification
Abstract
Fine-grained image classification, which aims to distinguish subtle differences among subordinate
categories, remains a very difficult task due to the high
cost of annotating an enormous number of fine-grained categories. To
cope with the scarcity of well-labeled training images, existing works mainly follow two research directions: 1) utilizing
freely available web images without human annotation; 2)
annotating only some fine-grained categories and transferring the
knowledge to other fine-grained categories, which falls into
the scope of zero-shot learning (ZSL). However, each of these
two directions has its own drawbacks. For the first direction, the labels of web images are very noisy and the
data distributions of web images and test images are
considerably different. For the second direction, the performance gap between ZSL and traditional supervised learning is still very large. These drawbacks motivate us to design a new framework that can
jointly leverage both web data and auxiliary labeled categories to predict the test categories that are not associated
with any well-labeled training images. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed framework.