Abstract
The semantic concept hierarchy remains under-explored in semantic segmentation because of the inefficiency and complicated
optimization involved in incorporating structural inference into dense
prediction. This lack of semantic-correlation modeling also
forces prior works to tune highly specialized models for
each task because of label discrepancies across datasets, which
severely limits the generalization capability of segmentation
models with respect to open-set concept vocabularies and annotation utilization. In this paper, we propose a Dynamic-Structured
Semantic Propagation Network (DSSPN) that builds a semantic neuron graph by explicitly incorporating the semantic
concept hierarchy into network construction. Each neuron
represents the instantiated module for recognizing a specific
type of entity such as a super-class (e.g. food) or a specific
concept (e.g. pizza). During training, DSSPN executes a
dynamic-structured neuron computation graph, activating only a sub-graph of neurons for each image in a principled
way. A dense semantic-enhanced neural block is proposed
to propagate the learned knowledge of all ancestor neurons
into each fine-grained child neuron to evolve its features. Another merit of such a semantically explainable structure is the
ability to learn a unified model concurrently on diverse
datasets by selectively activating different neuron sub-graphs
for each annotation at each step. Extensive experiments on
four public semantic segmentation datasets (i.e. ADE20K,
COCO-Stuff, Cityscapes and Mapillary) demonstrate the superiority of our DSSPN over state-of-the-art segmentation
models. Moreover, we demonstrate that a universal segmentation model jointly trained on diverse datasets can
surpass the performance of the common fine-tuning scheme
for exploiting multi-domain knowledge.
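
As an illustration of the sub-graph activation idea, the following is a minimal sketch rather than the DSSPN implementation. It assumes a toy concept hierarchy stored as a child-to-parent map; the names `PARENT`, `ancestors` and `activated_neurons` are hypothetical and do not come from the paper. Given the concepts annotated in one image, it collects the neurons that would be activated: each annotated concept together with all of its ancestor super-classes.

```python
# Hypothetical toy hierarchy: child concept -> parent concept (None marks the root).
# The concepts and their arrangement are assumptions made for illustration only.
PARENT = {
    "entity": None,
    "food": "entity",
    "vehicle": "entity",
    "pizza": "food",
    "cake": "food",
    "car": "vehicle",
}


def ancestors(concept):
    """Return the ancestors of `concept`, ordered from the root downwards."""
    chain = []
    node = PARENT[concept]
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain[::-1]


def activated_neurons(image_labels):
    """Collect each annotated concept plus all of its ancestor super-classes."""
    active = set()
    for label in image_labels:
        active.update(ancestors(label))
        active.add(label)
    return active


if __name__ == "__main__":
    # Only the branches that lead to the annotated concepts are activated.
    print(sorted(activated_neurons(["pizza", "car"])))
```

In this sketch an image annotated with pizza and car activates the food and vehicle branches but leaves the cake neuron untouched. Under the same assumption, datasets with different label sets would simply activate different sub-graphs of one shared hierarchy.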