CDM: Class Distribution Monitoring for Concept Drift Detection


The Idea

We introduce Class Distribution Monitoring (CDM), an effective concept-drift detection scheme that monitors the class-conditional distributions of a datastream. Rather than using supervised samples to compute and monitor the classification error, which is the mainstream approach in concept drift detection, CDM uses them to monitor each class with a separate change-detection test. Notably, CDM can identify which classes are affected by the concept drift, and it guarantees control over the Average Run Length (ARL0), namely the expected time before a false alarm.

Class Distribution Monitoring

Fig. 1: CDM detects the drift faster, especially when the drift affects only a subset of classes. CDM also indicates which class triggered the detection.

In Class Distribution Monitoring, we employ a separate instance of QuantTree Exponentially Weighted Moving Average (QT-EWMA) [Frittoli et al. 2021] to monitor each class-conditional distribution. QT-EWMA is a nonparametric online change-detection test based on QuantTree histograms [Boracchi et al. 2018], and it is designed to monitor multivariate datastreams.

CDM reports a concept drift after detecting a change in the distribution of at least one class. The main advantages of CDM are:

  • it can detect any relevant drift, including virtual drifts that have little impact on the classification error and are therefore undetectable by the many concept drift detectors that monitor it;
  • it can detect concept drifts affecting only a subset of classes more promptly than methods that monitor the overall data distribution;
  • it indicates which classes have been affected by concept drift, which might be crucial for diagnostics and adaptation;
  • it effectively controls false alarms by maintaining a target Average Run Length (ARL0), namely the expected time before a false alarm, which can be set before monitoring.
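
The per-class monitoring scheme described above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen for illustration: `ToyBinEWMA` is a simplified stand-in for a per-class change-detection test and is not the QT-EWMA algorithm (it bins only the first feature using training quantiles and uses a hand-picked constant threshold, so it does not control ARL0), and `PerClassMonitor` only shows how labeled samples might be routed to one detector per class and how the alarming class can be reported.

```python
import numpy as np


class ToyBinEWMA:
    """Toy per-class change detector (illustrative stand-in, NOT QT-EWMA).

    Assumptions: bins come from training quantiles of the first feature only,
    and the threshold is a fixed hand-picked constant (no ARL0 control).
    """

    def __init__(self, train, n_bins=8, lam=0.05, threshold=1.5):
        train = np.asarray(train, dtype=float)
        # Interior quantile edges give roughly equal-probability bins.
        qs = np.linspace(0, 1, n_bins + 1)[1:-1]
        self.edges = np.quantile(train[:, 0], qs)
        self.pi = np.full(n_bins, 1.0 / n_bins)  # expected bin probabilities
        self.z = self.pi.copy()                  # EWMA of bin indicators
        self.lam = lam
        self.threshold = threshold

    def update(self, x):
        """Consume one sample; return True if the statistic exceeds the threshold."""
        k = int(np.searchsorted(self.edges, x[0]))
        y = np.zeros_like(self.pi)
        y[k] = 1.0
        self.z = (1 - self.lam) * self.z + self.lam * y
        # Pearson-like distance between smoothed and expected bin frequencies.
        stat = float(np.sum((self.z - self.pi) ** 2 / self.pi))
        return stat > self.threshold


class PerClassMonitor:
    """Keeps one detector per class and feeds each labeled sample to its own detector."""

    def __init__(self, X_train, y_train, **detector_kwargs):
        X_train = np.asarray(X_train, dtype=float)
        y_train = np.asarray(y_train)
        self.detectors = {
            c: ToyBinEWMA(X_train[y_train == c], **detector_kwargs)
            for c in np.unique(y_train)
        }

    def update(self, x, label):
        """Return the label of the class whose detector raised an alarm, else None."""
        if self.detectors[label].update(np.asarray(x, dtype=float)):
            return label
        return None


# Usage sketch on synthetic data: both classes are stationary at first,
# then only class 1 drifts (its first feature is shifted).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 2))
y_train = np.repeat([0, 1], 500)
monitor = PerClassMonitor(X_train, y_train)

alarms_before = []
for _ in range(600):  # stationary phase
    label = int(rng.integers(2))
    hit = monitor.update(rng.normal(size=2), label)
    if hit is not None:
        alarms_before.append(hit)

first_alarm = None
for _ in range(600):  # drift phase: class 1 shifts, class 0 does not
    label = int(rng.integers(2))
    x = rng.normal(size=2)
    if label == 1:
        x[0] += 4.0
    first_alarm = monitor.update(x, label)
    if first_alarm is not None:
        break

print("alarms before drift:", alarms_before)
print("class flagged after drift:", first_alarm)
```

Because each class has its own detector, the identity of the detector that raises the alarm directly tells which class changed. A faithful implementation would replace the toy detector with QT-EWMA, whose thresholds are set to meet the target ARL0 rather than chosen by hand.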

Our code is available for download here.
The poster of our presentation at IJCNN 2022 is also available for download here.


References

[Stucchi et al. 2022] Class Distribution Monitoring for Concept Drift Detection
Diego Stucchi, Luca Frittoli, Giacomo Boracchi
IEEE-INNS International Joint Conference on Neural Networks (IJCNN) 2022
(Preprint), (Code)

[Frittoli et al. 2021] Change Detection in Multivariate Datastreams Controlling False Alarms
Luca Frittoli, Diego Carrera, Giacomo Boracchi
Proceedings of European Conference on Machine Learning (ECML) 2021
(Preprint), (Code), (Supplementary Material), (Poster)

[Boracchi et al. 2018] QuantTree: Histograms for Change Detection in Multivariate Data Streams
Giacomo Boracchi, Diego Carrera, Cristiano Cervellera, Danilo Macciò
International Conference on Machine Learning (ICML) 2018
(Paper), (Source Code), (Bibtex)