Abstract
In this paper, we introduce a model of lifelong learning based on a Network of Experts. New tasks/experts are
learned and added to the model sequentially, building on
what was learned before. To ensure scalability of this process, data from previous tasks cannot be stored and hence is
not available when learning a new task. A critical issue in
such a context, not addressed in the literature so far, is deciding
which expert to deploy at test time. We introduce a set of gating autoencoders that learn a representation for the task at hand and, at test time, automatically
forward the test sample to the relevant expert. This also
improves memory efficiency, as only one expert network has to
be loaded into memory at any given time. Further, the autoencoders inherently capture the relatedness of one task to
another, based on which the most relevant prior model can be
selected for training a new expert, using fine-tuning or learning-without-forgetting. We evaluate our method
on image classification and video prediction problems.