Abstract
An increasing number of well-trained deep networks have been released online by researchers and
developers, enabling the community to reuse them
in a plug-and-play way without accessing the training annotations. However, due to the large number
of network variants, such publicly available trained
models often have different architectures, each
tailored to a specific task or dataset.
In this paper, we study a deep-model reusing task,
where we are given as input pre-trained networks
of heterogeneous architectures specializing in distinct tasks, as teacher models. We aim to learn a
multitalented and light-weight student model that
is able to grasp the integrated knowledge from all
such heterogeneous-structure teachers, again without accessing any human annotation. To this end,
we propose a common feature learning scheme, in
which the features of all teachers are transformed
into a common space and the student is trained
to imitate them all so as to amalgamate their knowledge intact. We evaluate the proposed approach on a
set of benchmarks and demonstrate that the learned
student achieves promising performance, superior to that of the teachers on their specialized tasks.
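
As a rough illustration of the common feature learning idea summarized above, the following minimal sketch projects teacher features of different dimensionalities into one shared space and measures how far the student's projected feature lies from each of them. The abstract does not specify the form of the transformation or the loss, so the linear adaptors, the mean-squared-error objective, and all names here (project, common_feature_loss, d_common) are assumptions made for illustration, not the method of the paper. The adaptors are random rather than learned.

```python
import random


def project(features, weights):
    """Apply a linear map: a feature vector of length d_in becomes length d_common."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def mse(a, b):
    """Mean squared error between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def common_feature_loss(student_feat, teacher_feats, student_adaptor, teacher_adaptors):
    """Average distance, in the common space, between the student and each teacher.

    Assumption: one linear adaptor per network and an MSE imitation loss.
    """
    s = project(student_feat, student_adaptor)
    losses = [mse(s, project(t, w)) for t, w in zip(teacher_feats, teacher_adaptors)]
    return sum(losses) / len(losses)


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a learned adaptor: a random rows-by-cols matrix."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_common = 4           # size of the shared feature space (assumed)
teacher_dims = [6, 3]  # two teachers with heterogeneous feature sizes (assumed)
student_dim = 5        # feature size of the student (assumed)

teacher_feats = [[rng.uniform(-1, 1) for _ in range(d)] for d in teacher_dims]
student_feat = [rng.uniform(-1, 1) for _ in range(student_dim)]
teacher_adaptors = [random_matrix(d_common, d, rng) for d in teacher_dims]
student_adaptor = random_matrix(d_common, student_dim, rng)

loss = common_feature_loss(student_feat, teacher_feats, student_adaptor, teacher_adaptors)
print(loss)
```

In a training loop, a loss of this kind would be minimized with respect to the student and the adaptors using only unlabeled inputs, which is consistent with the annotation-free setting described above.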