Abstract
We consider the problem of data augmentation, i.e., generating artificial samples to extend a given corpus of training data. Specifically, we propose attribute-guided augmentation (AGA), which learns a mapping that allows synthesis of data such that an attribute of a synthesized sample
is at a desired value or strength. This is particularly interesting in situations where only a small amount of data, without attribute annotations, is available for learning, but we have access to an
external corpus of heavily annotated samples. While prior
works primarily augment in the space of images, we propose to perform augmentation in feature space instead. We
implement our approach as a deep encoder-decoder architecture that learns the synthesis function in an end-to-end
manner. We demonstrate the utility of our approach on the
problems of (1) one-shot object recognition in a transfer-learning setting where we have no prior knowledge of the
new classes, as well as (2) object-based one-shot scene
recognition. As external data, we leverage 3D depth and
pose information from the SUN RGB-D dataset. Our experiments show that attribute-guided augmentation of high-level CNN features considerably improves one-shot recognition performance on both problems.
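To make the idea of feature-space, attribute-conditioned synthesis concrete, the following is a minimal sketch of an encoder-decoder of the general kind described above, assuming a PyTorch-style implementation; the class name, layer sizes, and the way the target attribute is injected are illustrative assumptions, not the authors' architecture:

```python
# Hypothetical sketch of attribute-guided feature augmentation (not the
# paper's code): an encoder-decoder that maps a CNN feature vector x to a
# synthetic feature x_hat whose attribute is shifted to a target value t.
import torch
import torch.nn as nn

class AGASketch(nn.Module):
    def __init__(self, feat_dim=4096, hidden_dim=512):
        super().__init__()
        # Encoder compresses the feature; the decoder reconstructs it,
        # conditioned on the desired target attribute value t.
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim + 1, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim))

    def forward(self, x, t):
        z = self.encoder(x)                        # (B, hidden_dim)
        z = torch.cat([z, t.unsqueeze(1)], dim=1)  # append target attribute
        return self.decoder(z)                     # synthesized feature x_hat

# Training on the external, attribute-annotated corpus: pairs of features
# (x_src, x_tgt) showing the same object at source/target attribute values.
model = AGASketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x_src, x_tgt = torch.randn(32, 4096), torch.randn(32, 4096)  # placeholder data
a_tgt = torch.rand(32)                                       # target attribute values
loss = nn.functional.mse_loss(model(x_src, a_tgt), x_tgt)
opt.zero_grad(); loss.backward(); opt.step()
```

At test time, such a network could be applied to the features of unannotated one-shot samples at several target attribute values (e.g., depths or poses), yielding additional synthetic training features per class.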