Abstract
Despite the promising performance of conventional fully supervised algorithms, semantic segmentation remains an important yet challenging task. Because complete annotations are scarce, it is of great interest to design solutions for semantic segmentation that exploit weakly labeled data, which is readily available at a much larger scale. In contrast to the common practice of developing a different algorithm for each type of weak annotation, in this work we propose a unified approach that incorporates various forms of weak supervision – image-level tags, bounding boxes, and partial labels – to produce a pixel-wise labeling. We conduct a rigorous evaluation on the challenging Siftflow dataset for various weakly labeled settings, and show that our approach outperforms the state-of-the-art by 12% on per-class accuracy, while maintaining comparable per-pixel accuracy.