Fast Hoeffding and McDiarmid Drift Detection Methods for Adaptive
Learning from Evolving Data Streams
PRESENTER:
Ali Pesaranghader
University of Ottawa
ABSTRACT:
Learning from evolving data streams is a challenging task due to the
distributional changes in data, i.e., the 'concept drift' phenomenon.
Learning algorithms must adapt to the new distributions to keep
classification accuracy high. Drift detection methods, as the main
component of adaptive learning algorithms, are responsible for detecting
concept drifts with the least possible delay after they appear in
data streams. Such methods should also avoid high false positive and false
negative rates while the input data are processed. A false positive is a
false alarm for a concept drift, whereas a false negative is a real
concept drift that goes undetected. False positives keep more resources
busy, whereas false negatives reduce classification accuracy.
Additionally, under the probably approximately correct (PAC) learning
model, a high false positive rate may prevent accuracy from increasing,
because too little data would be used for training. Drift detection
methods should not assume that the input data, e.g., prediction results,
follow a specific distribution, since streaming data are dynamic by
nature. Finally, an adaptive learning algorithm must achieve higher
accuracy than a non-adaptive algorithm to be considered beneficial for
learning from evolving data streams. To
address these challenges, we introduce three kinds of sliding window-based
methods, which use Hoeffding's and McDiarmid's inequalities, for detecting
drift points in a data stream. We experimentally show that our methods
outperform state-of-the-art drift detection methods.
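
To make the idea of a sliding window combined with Hoeffding's inequality
more concrete, the following is a minimal Python sketch of one way such a
detector could work. It is an illustration under stated assumptions, not
necessarily the presenter's algorithm: the class name, the parameters
`window_size` and `delta`, their default values, and the rule of comparing
the current window accuracy against the best window accuracy seen so far
are all assumptions introduced here. The input is assumed to be a stream of
prediction results (correct or incorrect), and the McDiarmid-based variant
is not covered.

```python
import math
from collections import deque


class SlidingWindowHoeffdingDetector:
    """Hypothetical sliding-window drift detector based on Hoeffding's bound."""

    def __init__(self, window_size=25, delta=1e-6):
        # window_size and delta are illustrative defaults (assumptions).
        self.window_size = window_size
        # For the mean of n independent values in [0, 1], Hoeffding gives
        # P(mean - E[mean] >= eps) <= exp(-2 * n * eps**2).
        # Setting the right-hand side to delta yields this threshold.
        self.epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * window_size))
        self.reset()

    def reset(self):
        """Forget all history, e.g. after a drift has been signalled."""
        self.window = deque(maxlen=self.window_size)
        self.correct = 0          # number of correct predictions in the window
        self.max_accuracy = 0.0   # best full-window accuracy seen so far

    def update(self, is_correct):
        """Feed one prediction result; return True if a drift is signalled."""
        if len(self.window) == self.window_size:
            # The oldest entry is about to be evicted by append().
            self.correct -= self.window[0]
        bit = 1 if is_correct else 0
        self.window.append(bit)
        self.correct += bit

        if len(self.window) < self.window_size:
            return False  # wait until the window is full

        accuracy = self.correct / self.window_size
        self.max_accuracy = max(self.max_accuracy, accuracy)

        # Signal a drift when accuracy has dropped significantly below its peak.
        if self.max_accuracy - accuracy >= self.epsilon:
            self.reset()
            return True
        return False
```

In this sketch, no assumption is made about the distribution of the
prediction results beyond their being bounded in [0, 1], which is what
Hoeffding's inequality requires. A smaller `delta` raises the threshold and
so lowers the false positive rate at the cost of a longer detection delay.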