DA-GAN: Instance-level Image Translation by Deep Attention Generative
Adversarial Networks
Abstract
Unsupervised image translation, which aims to translate between two independent sets of images, is challenging because the correct correspondences must be discovered without paired data. Existing works build upon Generative Adversarial Networks (GANs) such that the distribution of the translated images is indistinguishable from that of the target set. However, such set-level constraints cannot learn instance-level correspondences (e.g., aligned semantic parts in the object-transfiguration task). This limitation often results in false positives (e.g., geometric or semantic artifacts) and further leads to the mode-collapse problem. To address these issues, we propose a novel framework for instance-level image translation by a Deep Attention GAN (DA-GAN). This design enables DA-GAN to decompose the task of translating samples between two sets into translating instances in a highly structured latent space. Specifically, we jointly learn a deep attention encoder, so that instance-level correspondences can be discovered by attending to the learned instances. Constraints can therefore be imposed at both the set level and the instance level. Comparisons against several state-of-the-art methods demonstrate the superiority of our approach, and its broad applicability, e.g., to pose morphing and data augmentation, pushes the frontier of the domain-translation problem.
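As a rough illustration of the idea summarized above, the sketch below (all names are hypothetical and not the paper's actual code) shows how soft attention can pool spatial features into per-instance codes, on which an instance-level consistency term can then be computed alongside the usual set-level adversarial loss:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(features, queries):
    """Soft attention: each query selects a spatial region (an 'instance').

    features: (H*W, d) flattened spatial feature map
    queries:  (k, d)   learned instance queries
    Returns (k, H*W) attention masks and (k, d) attended instance codes.
    """
    scores = queries @ features.T          # (k, H*W) similarity scores
    masks = softmax(scores, axis=1)        # each mask sums to 1 over locations
    instances = masks @ features           # (k, d) mask-weighted feature pooling
    return masks, instances

def instance_loss(inst_src, inst_trans):
    """Instance-level consistency: matched instances should stay close."""
    return float(np.mean((inst_src - inst_trans) ** 2))
```

In this reading, the set-level GAN loss constrains whole translated images, while `instance_loss` penalizes mismatches between corresponding attended instances before and after translation.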