Abstract
In this paper, we propose a novel generative model
named Stacked Generative Adversarial Networks (SGAN),
which is trained to invert the hierarchical representations
of a bottom-up discriminative network. Our model consists of a top-down stack of GANs, each learned to generate
lower-level representations conditioned on higher-level representations. A representation discriminator is introduced
at each feature hierarchy to encourage the representation
manifold of the generator to align with that of the bottom-up discriminative network, leveraging the powerful discriminative representations to guide the generative model. In
addition, we introduce a conditional loss that encourages
the use of conditional information from the layer above,
and a novel entropy loss that maximizes a variational lower
bound on the conditional entropy of generator outputs. We
first train each stack independently, and then train the whole
model end-to-end. Unlike the original GAN that uses a single noise vector to represent all the variations, our SGAN
decomposes variations into multiple levels and gradually
resolves uncertainties in the top-down generative process.
Based on visual inspection, Inception scores, and a visual Turing test, we demonstrate that SGAN is able to generate images of much higher quality than GANs without stacking.
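To make the top-down generative process concrete, the following is a minimal sketch (not the paper's implementation) of how a stack of per-level generators decomposes variation across levels: each level receives the representation from the level above plus its own fresh noise vector. The dimensions, the random affine-plus-tanh generators, and the `generate` function are all illustrative stand-ins for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative feature sizes, from the top (class label) down to the
# bottom (pixel-level output). These are assumptions, not the paper's.
dims = [10, 32, 128, 784]      # y, h2, h1, x
noise_dims = [8, 8, 8]         # per-level noise z2, z1, z0

# Stand-ins for trained stack generators: each G_i is a random affine
# map followed by tanh, taking (higher-level representation, noise z_i).
Ws = [rng.standard_normal((dims[i + 1], dims[i] + noise_dims[i])) * 0.1
      for i in range(3)]

def generate(y):
    """Top-down sampling: each stack injects its own noise z_i, so
    variation is decomposed across levels instead of being packed
    into a single noise vector as in the original GAN."""
    h = y
    for i, W in enumerate(Ws):
        z = rng.standard_normal(noise_dims[i])
        h = np.tanh(W @ np.concatenate([h, z]))  # h_{i} = G_i(h_{i+1}, z_i)
    return h

y = np.eye(10)[3]              # condition the top generator on class 3
x = generate(y)
print(x.shape)                 # (784,)
```

Sampling `generate(y)` twice with the same label yields different outputs, since fresh noise is drawn at every level; this mirrors how uncertainty is resolved gradually in the top-down pass.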