Abstract
Learning from data streams is among the most vital
contemporary fields in machine learning and data
mining. Streams pose new challenges to learning
systems due to their volume and velocity, as well
as their ever-changing nature caused by concept drift.
The vast majority of works on data streams assume a
fully supervised learning scenario with unrestricted access to class labels. This assumption
does not hold in real-world applications, where obtaining ground truth is costly and time-consuming.
Therefore, we need to carefully select which instances should be labeled, as we usually work under a strict label budget. In this paper, we
propose a novel active learning approach based on
ensemble algorithms that is capable of using multiple base classifiers during the label query process. It is a plug-in solution, capable of working
with most existing streaming ensemble classifiers. We realize this process as a Multi-Armed
Bandit problem, obtaining an efficient and adaptive
ensemble active learning procedure by selecting the
most competent classifier from the pool for each
query. In order to better adapt to concept drifts, we
guide our instance selection by measuring the generalization capabilities of our classifiers. This adaptive solution leads not only to better instance selection under sparse access to class labels, but also
to improved adaptation to various types of concept
drift and increased diversity of the underlying
ensemble classifier.
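
As a rough illustration of the idea of treating classifier selection as a Multi-Armed Bandit problem, the sketch below treats each base classifier as one arm and picks an arm before every query decision. It is only a minimal sketch under stated assumptions: the abstract does not name the bandit strategy, the reward, or the query rule, so the epsilon-greedy policy, the scalar reward, the uncertainty threshold, and all names (`EpsilonGreedySelector`, `should_query`, `budget`) are hypothetical choices made for illustration.

```python
import random


class EpsilonGreedySelector:
    """Hypothetical bandit: one arm per base classifier in the ensemble."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # how often each arm received a reward
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random arm with probability epsilon, otherwise exploit
        # the arm with the highest mean reward (ties go to the lowest index).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean reward of the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def should_query(posteriors, threshold, labels_used, instances_seen, budget):
    """Assumed query rule: ask for a label when the selected classifier is
    uncertain (max posterior below threshold) and the budget is not spent."""
    if instances_seen > 0 and labels_used / instances_seen >= budget:
        return False
    return max(posteriors) < threshold
```

In such a setup, the selected classifier's class posteriors would be passed to `should_query` for each incoming instance, and after a label is obtained the arm would be rewarded with some estimate of that classifier's competence (for example, whether its prediction matched the label); that reward definition is likewise an assumption, not something stated in the abstract.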