Abstract This paper studies the problem of learning image semantic segmentation networks using only image-level labels as supervision, which is important since it can significantly reduce human annotation efforts. Recent state-of-the-art methods on this problem first infer the sparse and discriminative regions for each object class using a deep classification network, then train a semantic segmentation network using the discriminative regions as supervision. Inspired by the traditional image segmentation method of seeded region growing, we propose to train a semantic segmentation network starting from the discriminative regions and to progressively increase the pixel-level supervision using seeded region growing. The seeded region growing module is integrated into a deep segmentation network and can benefit from deep features. Unlike conventional deep networks, which have fixed/static labels, the proposed weakly-supervised network generates new labels using the contextual information within an image. The proposed method significantly outperforms weakly-supervised semantic segmentation methods that use static labels and obtains state-of-the-art performance: 63.2% mIoU on the PASCAL VOC 2012 test set and 26.0% mIoU on the COCO dataset.
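
As a rough illustration of the seeded region growing idea described in the abstract, the following Python sketch expands class labels outward from seed pixels into neighbouring pixels that the segmentation output assigns to the same class with high probability. It is a simplified sketch under stated assumptions, not the paper's implementation: the function name `grow_seeds`, the single threshold `theta` and its default value, the 8-connected neighbourhood, and the `-1` marker for unlabeled pixels are all hypothetical choices made for this example.

```python
from collections import deque

import numpy as np


def grow_seeds(probs, seeds, theta=0.85, unlabeled=-1):
    """Grow seed labels into neighbouring pixels (illustrative sketch only).

    probs: float array of shape (C, H, W), per-pixel class probabilities
           (assumed to come from a segmentation network).
    seeds: int array of shape (H, W), holding a class index at seed pixels
           and `unlabeled` everywhere else.
    theta: assumed similarity threshold; a neighbour joins a region of
           class c only if probs[c] at that pixel is at least theta.
    """
    _, height, width = probs.shape
    labels = seeds.copy()  # leave the caller's seed map untouched
    # Start the breadth-first expansion from every seed pixel.
    queue = deque(zip(*np.nonzero(labels != unlabeled)))
    while queue:
        y, x = queue.popleft()
        c = labels[y, x]
        # Visit the 8-connected neighbours of the current pixel.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                    if labels[ny, nx] == unlabeled and probs[c, ny, nx] >= theta:
                        labels[ny, nx] = c
                        queue.append((ny, nx))
    return labels
```

In a training loop of the kind the abstract outlines, the grown label map would replace the static seed map as pixel-level supervision at each iteration, so the supervised area can expand as the network's predictions improve. Pixels that never pass the threshold remain unlabeled and would be ignored by the loss.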