Abstract
Do visual tasks have relationships, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image?
Intuition answers these questions positively, implying the existence of a certain structure among visual tasks. Understanding this structure has notable value: it provides a principled way to identify relationships across tasks, for instance, in order to reuse supervision among redundant tasks or to solve many tasks in one system without piling up complexity.
We propose a fully computational approach for identifying the transfer learning structure of the space of visual tasks. This is done by computing the transfer learning dependencies across tasks in a dictionary of twenty-six 2D, 2.5D, 3D, and semantic tasks. The product is a computational taxonomic map among tasks for transfer learning, and we exploit it to reduce the demand for labeled data.
For example, we show that the total number of labeled datapoints needed for solving a set of 10 tasks can be reduced by roughly 2/3 (compared to training independently) while keeping the performance nearly the same. We provide a set of tools for computing and visualizing this taxonomical structure at http://taskonomy.vision.
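
To make the transfer-selection idea concrete, the following is a minimal, illustrative sketch, not the paper's actual procedure (which normalizes raw transfer losses and solves a Boolean Integer Program over the full dependency graph). It assumes a precomputed matrix of transfer affinities and greedily picks a small set of source tasks, then assigns each target its best selected source. All names here (`greedy_taxonomy`, the task list, the random affinities) are hypothetical placeholders.

```python
import numpy as np

# Illustrative subset of a task dictionary; names are placeholders.
TASKS = ["autoencoding", "depth", "surface_normals", "segmentation", "vanishing_points"]

def greedy_taxonomy(affinity, budget):
    """Greedily pick up to `budget` source tasks maximizing the summed
    transfer affinity over all targets, where affinity[s, t] is the
    (normalized, higher-is-better) quality of transferring from source
    task s to target task t."""
    n = affinity.shape[0]
    selected = []
    for _ in range(budget):
        best_gain, best_task = -np.inf, None
        for s in range(n):
            if s in selected:
                continue
            candidate = selected + [s]
            # Each target is covered by its best candidate source.
            gain = affinity[candidate, :].max(axis=0).sum()
            if gain > best_gain:
                best_gain, best_task = gain, s
        selected.append(best_task)
    # Final transfer policy: target task -> best selected source task.
    policy = {
        TASKS[t]: TASKS[selected[int(np.argmax(affinity[selected, t]))]]
        for t in range(n)
    }
    return [TASKS[s] for s in selected], policy

# Stand-in affinities; in practice these would be measured by training
# small transfer networks between frozen task representations.
rng = np.random.default_rng(0)
affinity = rng.random((len(TASKS), len(TASKS)))
sources, policy = greedy_taxonomy(affinity, budget=2)
print("source tasks:", sources)
print("transfer policy:", policy)
```

Fixing a small source budget is what drives the labeled-data savings described above: only the selected source tasks need full supervision, while the remaining targets are reached through transfers.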