Abstract
We study the problem of improving a machine
learning model by identifying and using features
that are absent from the training set. This setting arises
when machine learning systems are deployed in open
environments. For example, a prediction model
built on a set of sensors may be improved when
it has access to new and relevant sensors at test
time. To effectively use new features, we propose a novel approach that learns a model over
both the original and new features, with the goal
of making the joint distribution of features and
predicted labels similar to that in the training set.
Our approach can naturally leverage labels associated with these new features when such labels are available. We present an efficient optimization algorithm for learning the model parameters and empirically evaluate the approach on several regression
and classification tasks. Experimental results show
that our approach achieves an average improvement of
11.2% over baselines.