Abstract
We construct custom regularization functions for use in
supervised training of deep neural networks. Our technique
is applicable when the ground-truth labels themselves exhibit internal structure; we derive a regularizer by learning an autoencoder over the set of annotations. Training
thereby becomes a two-phase procedure. The first phase
models labels with an autoencoder. The second phase trains
the actual network of interest by attaching an auxiliary
branch that must predict the label's encoding in a hidden layer of the autoencoder. After training, we discard this auxiliary branch.
We experiment in the context of semantic segmentation,
demonstrating that this regularization strategy leads to consistent accuracy boosts over baselines, both when training
from scratch and in combination with ImageNet pretraining.
Gains are also consistent across different choices of convolutional network architecture. As our regularizer is discarded
after training, our method has zero cost at test time; the performance improvements are essentially free. We are simply
able to learn better network weights by building an abstract
model of the label space, and then training the network to
understand this abstraction alongside the original task.
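The two-phase procedure described above can be sketched in miniature. The following is a hedged illustration, not the paper's implementation: it uses synthetic one-hot "labels" in place of segmentation maps, plain linear maps in place of deep networks, and hand-written gradient steps; all names (X, Y, E, D, T, W, A, lam) are illustrative. Phase 1 fits a label autoencoder; phase 2 trains a shared trunk with a task head plus an auxiliary head that must predict the frozen encoder's code of the label, and the auxiliary head is discarded at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for structured labels: one-hot vectors over 4 classes.
# (Hypothetical synthetic data; the paper works with segmentation maps.)
n, n_classes, x_dim, t_dim, h_dim = 256, 4, 8, 6, 2
y_idx = rng.integers(0, n_classes, size=n)
Y = np.eye(n_classes)[y_idx]                      # (n, 4) label vectors
X = 0.5 * Y @ rng.normal(size=(n_classes, x_dim)) \
    + 0.1 * rng.normal(size=(n, x_dim))           # inputs correlated with labels

lr = 0.05

# Phase 1: linear autoencoder over the labels (encoder E, decoder D).
E = rng.normal(scale=0.3, size=(n_classes, h_dim))
D = rng.normal(scale=0.3, size=(h_dim, n_classes))
for _ in range(2000):
    H = Y @ E                                     # hidden label codes
    G = 2 * (H @ D - Y) / n                       # MSE grad w.r.t. reconstruction
    E -= lr * Y.T @ (G @ D.T)
    D -= lr * H.T @ G

# Phase 2: train the main "network" (trunk T + task head W) with an
# auxiliary branch A that must predict the frozen encoder's label code.
T = rng.normal(scale=0.3, size=(x_dim, t_dim))
W = rng.normal(scale=0.3, size=(t_dim, n_classes))
A = rng.normal(scale=0.3, size=(t_dim, h_dim))
lam = 0.5                                         # auxiliary loss weight
target_H = Y @ E                                  # frozen autoencoder codes
for _ in range(3000):
    Z = X @ T                                     # shared trunk features
    Gp = 2 * (Z @ W - Y) / n                      # task-loss gradient
    Gq = 2 * (Z @ A - target_H) / n               # auxiliary-loss gradient
    dZ = Gp @ W.T + lam * Gq @ A.T                # both losses shape the trunk
    W -= lr * Z.T @ Gp
    A -= lr * lam * Z.T @ Gq
    T -= lr * X.T @ dZ

# Test time: the auxiliary branch A is discarded; only T and W are used.
pred = (X @ T @ W).argmax(axis=1)
accuracy = (pred == y_idx).mean()
```

Because the auxiliary gradient flows through the shared trunk `T`, the label-structure objective regularizes the weights used at test time even though `A` itself is thrown away, mirroring the zero-test-cost property claimed above.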