Abstract
In this paper, we introduce a novel approach to automatically regulating the receptive field in deep image parsing networks. Unlike previous works, which have placed great
emphasis on obtaining better receptive fields using manually selected dilated convolutional kernels, our approach
uses two affine transformation layers in the network’s backbone that operate on feature maps. Feature maps are
inflated or shrunk by the new layers, and the receptive
fields of the following layers change accordingly. Through end-to-end training, the whole framework is data-driven and requires no laborious manual intervention. The proposed method is
generic across datasets and tasks. We conduct extensive experiments on both a general image parsing task and
a face parsing task as concrete examples to demonstrate the
method’s superior regulation ability over manual designs.
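
To make the mechanism concrete, the following is a minimal sketch of how resizing a feature map changes the extent that a fixed kernel covers in the original map. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `rescale_feature_map` and `effective_kernel_extent` are hypothetical, the scale factor is passed in by hand rather than learned, only isotropic scaling is shown instead of a general affine transformation, and bilinear interpolation with aligned corners is an assumed choice.

```python
def rescale_feature_map(feature, scale):
    """Resample a 2D feature map (list of rows) by `scale`.

    Assumption: bilinear interpolation with aligned corners. In the
    approach described above the scale would be learned end to end;
    here it is a plain argument.
    """
    in_h, in_w = len(feature), len(feature[0])
    out_h = max(1, round(in_h * scale))
    out_w = max(1, round(in_w * scale))
    out = []
    for i in range(out_h):
        # Map the output row index back to input coordinates.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column index back to input coordinates.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = feature[y0][x0] * (1 - wx) + feature[y0][x1] * wx
            bottom = feature[y1][x0] * (1 - wx) + feature[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def effective_kernel_extent(kernel_size, scale):
    """Extent, in original feature-map units, covered by a kernel of
    `kernel_size` applied after the map has been resized by `scale`."""
    return kernel_size / scale


# Hypothetical 4x4 feature map used only for illustration.
feature = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
inflated = rescale_feature_map(feature, 2.0)  # 8x8 map
shrunk = rescale_feature_map(feature, 0.5)    # 2x2 map
```

Under these assumptions, a 3x3 kernel applied to the inflated map spans 1.5 units of the original map, while the same kernel applied to the shrunk map spans 6 units, which is the sense in which resizing feature maps regulates the receptive field of subsequent layers.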