Abstract
Convolutions are the fundamental building blocks of
CNNs. The fact that their weights are spatially shared is
one of the main reasons for their widespread use, but it is
also a major limitation, as it makes convolutions content-agnostic. We propose a pixel-adaptive convolution (PAC)
operation, a simple yet effective modification of standard
convolutions, in which the filter weights are multiplied with
a spatially varying kernel that depends on learnable, local pixel features. PAC is a generalization of several popular filtering techniques and thus can be used for a wide
range of use cases. Specifically, we demonstrate state-of-the-art performance when PAC is used for deep joint image upsampling. PAC also offers an effective alternative to
fully-connected CRF (Full-CRF), called PAC-CRF, which
performs competitively with Full-CRF while being
considerably faster. We also demonstrate that
PAC can be used as a drop-in replacement for convolution
layers in pre-trained networks, resulting in consistent performance improvements.
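
To make the operation described above concrete, the following is a minimal single-channel sketch of a pixel-adaptive convolution in NumPy. It is an illustration under stated assumptions, not the reference implementation: the function name `pac_single_channel`, the zero padding, and the choice of a Gaussian on feature differences as the adapting kernel are assumptions made here, and the features `f` are passed in as a fixed array rather than learned.

```python
import numpy as np

def pac_single_channel(v, f, w, b=0.0):
    """Illustrative pixel-adaptive convolution (hypothetical helper).

    v: (H, W) input values
    f: (H, W, C) per-pixel adapting features
    w: (k, k) spatially shared filter weights, k odd
    b: scalar bias
    """
    k = w.shape[0]
    r = k // 2
    H, W = v.shape
    # Zero-pad the input and the features so the output keeps the input size.
    v_pad = np.pad(v, r)
    f_pad = np.pad(f, ((r, r), (r, r), (0, 0)))
    out = np.full((H, W), b, dtype=float)
    for dy in range(k):
        for dx in range(k):
            v_shift = v_pad[dy:dy + H, dx:dx + W]
            f_shift = f_pad[dy:dy + H, dx:dx + W, :]
            # Assumed adapting kernel: Gaussian on the feature difference
            # between the center pixel and its neighbor.
            kern = np.exp(-0.5 * np.sum((f - f_shift) ** 2, axis=-1))
            # Shared weight multiplied by the spatially varying kernel.
            out += kern * w[dy, dx] * v_shift
    return out
```

With constant features the adapting kernel equals one inside the image and the sketch reduces to an ordinary zero-padded convolution (cross-correlation); with features that change sharply across an edge, contributions from the other side of the edge are suppressed.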