Abstract
Human vision is able to immediately recognize novel visual categories after seeing just one or a few training examples. We describe how to add a similar capability to
ConvNet classifiers by directly setting the final layer weights
from novel training examples during low-shot learning. We
call this process weight imprinting as it directly sets weights
for a new category based on an appropriately scaled copy of
the embedding layer activations for a training example of that category.
The imprinting process provides a valuable complement to
training with stochastic gradient descent, as it yields good classification performance immediately and an initialization for any later fine-tuning. We show how
this imprinting process is related to proxy-based embeddings. However, it differs in that only a single imprinted
weight vector is learned for each novel category, rather
than relying on a nearest-neighbor distance to training instances, as is typical of embedding methods. Our experiments show that averaging imprinted weights
provides better generalization than using nearest-neighbor
instance embeddings.
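
The imprinting step summarized above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the "appropriate scaling" is taken to be L2 normalization, the averaged vector is re-normalized, and classification uses cosine similarity against the final-layer weights. The function names (`l2_normalize`, `imprint_weight`, `classify`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumed form of the 'appropriate scaling')."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def imprint_weight(embeddings):
    """Build one weight vector for a novel class from its example embeddings.

    Each embedding is normalized, the normalized embeddings are averaged,
    and the average is normalized again (an assumption of this sketch).
    """
    unit = [l2_normalize(e) for e in embeddings]
    dim = len(unit[0])
    mean = [sum(u[i] for u in unit) / len(unit) for i in range(dim)]
    return l2_normalize(mean)


def classify(weights, embedding):
    """Return the index of the class whose weight vector is most similar (cosine)."""
    q = l2_normalize(embedding)
    scores = [sum(w_i * q_i for w_i, q_i in zip(w, q)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy setup: two base classes with existing final-layer weights (hypothetical values).
base_weights = [l2_normalize([1.0, 0.0, 0.0]), l2_normalize([0.0, 1.0, 0.0])]

# Two embeddings of a novel class, standing in for embedding-layer activations.
novel_examples = [[0.1, 0.0, 2.0], [0.0, 0.2, 3.0]]

# Imprinting: append a single new weight vector, with no gradient steps.
weights = base_weights + [imprint_weight(novel_examples)]

predicted = classify(weights, [0.0, 0.1, 1.5])
print(predicted)
```

In this sketch the novel class is represented by a single averaged vector rather than by one stored embedding per training instance, and the resulting weights could in principle serve as the starting point for later fine-tuning.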