Abstract. Semantic segmentation of outdoor scenes is problematic when
there are variations in imaging conditions. It is known that albedo (re-
flectance) is invariant to all kinds of illumination effects. Thus, using
reflectance images for semantic segmentation task can be favorable. Additionally, not only segmentation may benefit from reflectance, but also
segmentation may be useful for reflectance computation. Therefore, in
this paper, the tasks of semantic segmentation and intrinsic image decomposition are considered as a combined process by exploring their
mutual relationship in a joint fashion. To that end, we propose a supervised end-to-end CNN architecture to jointly learn intrinsic image
decomposition and semantic segmentation. We analyze the gains of addressing those two problems jointly. Moreover, new cascade CNN architectures for intrinsic-for-segmentation and segmentation-for-intrinsic
are proposed as single tasks. Furthermore, a dataset of 35K synthetic
images of natural environments is created with corresponding albedo
and shading (intrinsics), as well as semantic labels (segmentation) assigned to each object/scene. The experiments show that joint learning
of intrinsic image decomposition and semantic segmentation is beneficial
for both tasks for natural scenes. Dataset and models are available at:
https://ivi.fnwi.uva.nl/cv/intrinseg