Abstract
In this paper, we introduce two new ideas for one-shot learning: augmenting Convolutional Neural Networks (CNNs) with memory,
and learning to learn the network parameters for unlabelled images on the fly. Specifically,
we present Memory Matching Networks (MM-Net), a novel deep architecture that explores the training procedure,
following the philosophy that training and test conditions
must match. Technically, MM-Net writes the features of a
set of labelled images (support set) into memory and reads from memory when performing inference to holistically
leverage the knowledge in the set. Meanwhile, a Contextual
Learner employs the memory slots in a sequential manner to
predict the parameters of CNNs for unlabelled images. The
whole architecture is trained by showing only a few
examples per class at a time and switching the learning from minibatch to minibatch, which is tailored for one-shot learning,
where only a few examples of new categories are presented at
test time. Unlike conventional one-shot learning approaches, MM-Net outputs one unified model irrespective of the number of shots and categories. Extensive
experiments on two public datasets, Omniglot and miniImageNet, show superior results
compared to state-of-the-art approaches. More remarkably, MM-Net improves one-shot accuracy from 98.95% to 99.28% on Omniglot and from 49.21% to 53.37%
on miniImageNet.
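
To make the write/read idea concrete, the following is a minimal sketch of a support-set memory, not the MM-Net formulation itself. It assumes features are already extracted by some CNN, and it swaps in plain cosine-similarity attention for the read step; the names SupportMemory, write, and read are hypothetical and chosen only for illustration. The Contextual Learner that predicts CNN parameters is not modelled here.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SupportMemory:
    """Hypothetical memory: one slot per labelled support image (assumption)."""

    def __init__(self) -> None:
        self.keys: List[List[float]] = []   # support features
        self.labels: List[int] = []         # class index of each slot

    def write(self, feature: Sequence[float], label: int) -> None:
        """Store one support feature together with its label."""
        self.keys.append(list(feature))
        self.labels.append(label)

    def read(self, query: Sequence[float], num_classes: int) -> List[float]:
        """Attend over all slots and return a class distribution for the query."""
        weights = softmax([cosine(query, k) for k in self.keys])
        probs = [0.0] * num_classes
        for w, label in zip(weights, self.labels):
            probs[label] += w  # accumulate attention mass per class
        return probs


# Toy 2-way 1-shot episode with hand-made 2-D "features".
memory = SupportMemory()
memory.write([1.0, 0.0], label=0)
memory.write([0.0, 1.0], label=1)
probs = memory.read([0.9, 0.1], num_classes=2)
predicted = max(range(2), key=lambda c: probs[c])
```

Because the read step attends over however many slots were written, the same code handles any number of shots and categories, which loosely mirrors the "one unified model" property claimed above.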