Abstract. Multi-task learning has been widely adopted in many computer vision
tasks to improve overall computation efficiency or boost the performance of individual tasks, under the assumption that those tasks are correlated and complementary to each other. However, the relationships between the tasks are complicated
in practice, especially when the number of involved tasks scales up. When two
tasks are of weak relevance, they may compete or even distract each other during
joint training of shared parameters, and as a consequence undermine the learning
of all the tasks. This will raise destructive interference which decreases learning
efficiency of shared parameters and lead to low quality loss local optimum w.r.t.
shared parameters. To address the this problem, we propose a general modulation
module, which can be inserted into any convolutional neural network architecture, to encourage the coupling and feature sharing of relevant tasks while disentangling the learning of irrelevant tasks with minor parameters addition. Equipped
with this module, gradient directions from different tasks can be enforced to be
consistent for those shared parameters, which benefits multi-task joint training.
The module is end-to-end learnable without ad-hoc design for specific tasks, and
can naturally handle many tasks at the same time. We apply our approach on
two retrieval tasks, face retrieval on the CelebA dataset [12] and product retrieval
on the UT-Zappos50K dataset [34, 35], and demonstrate its advantage over other
multi-task learning methods in both accuracy and storage efficiency