Abstract
We consider the problem of zero-shot recognition: learning a visual classifier for a category with zero training examples, just using the word embedding of the category and
its relationship to other categories, for which visual data are
provided. The key to dealing with the unfamiliar or novel
category is to transfer knowledge obtained from familiar
classes to describe the unfamiliar class. In this paper, we
build upon the recently introduced Graph Convolutional
Network (GCN) and propose an approach that uses both
semantic embeddings and the categorical relationships to
predict the classifiers. Given a learned knowledge graph
(KG), our approach takes as input semantic embeddings for
each node (representing a visual category). After a series of
graph convolutions, we predict the visual classifier for each
category. During training, the visual classifiers for a few
categories are given to learn the GCN parameters. At test
time, these filters are used to predict the visual classifiers of
unseen categories. We show that our approach is robust to
noise in the KG. More importantly, our approach provides
significant improvement in performance compared to the current state-of-the-art results (from 2-3% on some metrics
to a whopping 20% on a few).
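
As a rough illustration of the pipeline summarized above, the following minimal sketch propagates per-node semantic embeddings through a few graph convolutions and regresses the output onto the given classifiers of the seen categories. It is not the implementation used in this paper: the toy graph, the dimensions, the two-layer depth, the ReLU nonlinearity, the symmetric adjacency normalization, and the mean-squared-error loss are all assumptions chosen for brevity, and the names `normalize_adjacency`, `gcn_forward`, and `seen_class_loss` are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)            # every degree is >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(a_norm, x, weights):
    """Run a stack of graph convolutions; ReLU follows every layer except the last."""
    h = x
    for i, w in enumerate(weights):
        h = a_norm @ h @ w             # aggregate neighbour features, then project
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h


def seen_class_loss(pred, target, seen_idx):
    """Mean squared error, computed only on nodes whose classifiers are given."""
    diff = pred[seen_idx] - target[seen_idx]
    return float(np.mean(diff ** 2))


# Toy setup (all values are made up): 4 categories, 3-d embeddings, 5-d classifiers.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
embeddings = rng.normal(size=(4, 3))           # one semantic embedding per node
classifiers = rng.normal(size=(4, 5))          # only the rows in seen_idx are "given"
seen_idx = np.array([0, 1])                    # nodes 2 and 3 play the unseen categories

weights = [0.1 * rng.normal(size=(3, 8)), 0.1 * rng.normal(size=(8, 5))]
a_norm = normalize_adjacency(adj)
predicted = gcn_forward(a_norm, embeddings, weights)   # one classifier per node
loss = seen_class_loss(predicted, classifiers, seen_idx)
```

In this sketch, training would adjust `weights` to reduce `loss` on the seen nodes only. At test time the same forward pass already yields rows of `predicted` for the unseen nodes, since every node shares the graph and the layer weights.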