Adaptive concept drift detection
Abstract
An established method to detect concept drift in data streams is to perform statistical hypothesis testing on the multivariate data in the stream. Statistical theory offers rank‐based statistics for this task. However, these statistics depend on a fixed set of characteristics of the underlying distribution. Thus, they work well whenever the change in the underlying distribution affects the properties measured by the statistic, but they perform poorly if the drift influences the characteristics captured by the test statistic only to a small degree. To address this problem, we show how uniform convergence bounds in learning theory can be adjusted for adaptive concept drift detection. In particular, we present three novel drift detection tests whose test statistics are dynamically adapted to match the actual data at hand. The first is based on a rank statistic on density estimates for a binary representation of the data, the second compares average margins of a linear classifier induced by the 1‐norm support vector machine (SVM), and the third is based on the average zero‐one, sigmoid, or stepwise linear error rate of an SVM classifier. We compare these new approaches with the maximum mean discrepancy method, the StreamKrimp system, and the multivariate Wald–Wolfowitz test. The results indicate that the new methods are able to detect concept drift reliably and that they perform favorably in a precision‐recall analysis. Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 2: 311‐327, 2009
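The idea behind the third test, using the error rate of a classifier that tries to tell two windows of the stream apart, can be illustrated with a minimal sketch. This is not the paper's procedure. It assumes two equally sized windows, trains a 2‐norm hinge‐loss linear classifier by plain subgradient descent (standing in for the SVM), and replaces the paper's uniform convergence bounds with a simple one‐sided Hoeffding bound on the held‐out zero‐one error. All function names, parameters, and defaults below are hypothetical.

```python
import math
import random


def train_linear_svm(points, labels, epochs=30, step=0.01, lam=0.01, seed=0):
    """Fit (w, b) by subgradient descent on the L2-regularised hinge loss.

    Assumption: a simple stand-in for an SVM solver, not the paper's 1-norm SVM.
    """
    rng = random.Random(seed)
    w = [0.0] * len(points[0])
    b = 0.0
    order = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = points[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regulariser shrinks w on every step.
            w = [wj - step * lam * wj for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + step * y * xj for wj, xj in zip(w, x)]
                b += step * y
    return w, b


def holdout_error(window_a, window_b):
    """Train on the first half of each window, test on the second half.

    Returns the zero-one error on the held-out points and their count.
    """
    half_a, half_b = len(window_a) // 2, len(window_b) // 2
    train_x = window_a[:half_a] + window_b[:half_b]
    train_y = [-1] * half_a + [1] * half_b
    test_x = window_a[half_a:] + window_b[half_b:]
    test_y = [-1] * (len(window_a) - half_a) + [1] * (len(window_b) - half_b)
    w, b = train_linear_svm(train_x, train_y)
    wrong = 0
    for x, y in zip(test_x, test_y):
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        predicted = 1 if score > 0 else -1
        wrong += predicted != y
    return wrong / len(test_x), len(test_x)


def detect_drift(window_a, window_b, delta=0.01):
    """Flag drift when the held-out error is significantly below 1/2.

    Assumption: equally sized windows, so that without drift the expected
    held-out error is 1/2 and Hoeffding's inequality bounds the chance of
    a false alarm by delta.
    """
    error, n = holdout_error(window_a, window_b)
    threshold = 0.5 - math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return error < threshold


def gaussian_window(n, mean, seed):
    """Hypothetical data generator: n points from an isotropic unit Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]
```

A call such as `detect_drift(gaussian_window(200, [0.0, 0.0], seed=1), gaussian_window(200, [3.0, 3.0], seed=3))` compares a reference window against a window with a shifted mean. Because the classifier is fit to the data at hand, the statistic adapts to whichever direction separates the two windows, which is the property the abstract contrasts with fixed rank‐based statistics.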
Related Papers
- Detecting group concept drift from multiple data streams (2022), 65 cited
- An Efficient Approach to Detect Concept Drifts in Data Streams (2017), 10 cited
- Towards Online Concept Drift Detection with Feature Selection for Data Stream Classification (2016), 9 cited
- An Ensemble Classification Framework to Evolving Data Streams (2014)
- A Review of Classification and Novel Class Detection Technique of Data Streams (2012)