Abstract

We propose FineGAN, a novel unsupervised GAN framework, which disentangles the background, object shape, and object appearance to hierarchically generate images of fine-grained object categories. To disentangle the factors without supervision, our key idea is to use information theory to associate each factor with a latent code, and to condition the relationships between the codes in a specific way to induce the desired hierarchy. Through extensive experiments, we show that FineGAN achieves the desired disentanglement to generate realistic and diverse images belonging to fine-grained classes of birds, dogs, and cars. Using FineGAN’s automatically learned features, we also cluster real images as a first attempt at solving the novel problem of unsupervised fine-grained object category discovery. Our code, models, and demo can be found at https://github.com/kkanshul/finegan