Tutorial “Change and Anomaly Detection in Data Streams”

Presenter: Giacomo Boracchi, Politecnico di Milano, DEIB

Motivation
Detecting changes and anomalies is important in many application domains: a change might indicate an unforeseen evolution of the process generating the data or a fault in the sensing apparatus, to name two examples. Similarly, anomalies are often the most informative samples, for instance frauds among genuine credit card transactions. When data arrive as a stream, promptly detecting changes and anomalies provides valuable information for understanding the stream dynamics and for activating suitable countermeasures. Not surprisingly, detecting changes and anomalies is a primary concern in financial analysis, quality inspection, and environmental and health-monitoring systems. Change detection also plays a central role in machine learning, where it is often the first step towards adaptation.

Aims and scope
The tutorial presents a rigorous formulation of the change/anomaly detection problem that fits sequential monitoring, classification and several signal/image analysis applications. The tutorial also illustrates the major approaches in the literature, considering supervised, semi-supervised and unsupervised monitoring scenarios.
Particular emphasis will be given to:
i) learning aspects related to change/anomaly detection and
ii) issues arising in big data scenarios, where the data dimension becomes large.
In particular, change/anomaly detection methods for monitoring multivariate data will be discussed, including the popular approach of monitoring the log-likelihood, which will be shown to suffer from detectability loss as the data dimension increases.
The tutorial is accompanied by various examples of real-world problems solved by change/anomaly detection methods. These include ECG monitoring solutions for wearable devices, credit card fraud detection systems and image-analysis algorithms for detecting defects in industrial monitoring scenarios.
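
As an illustration of the detectability-loss claim, the following minimal sketch monitors the log-likelihood of a standard Gaussian while the data dimension d grows and the change magnitude stays fixed. Everything in it is an assumption made for illustration, not the tutorial's own code: the pre-change distribution is N(0, I_d), the change is a mean shift of fixed norm along one axis, and detectability is summarized by one natural signal-to-noise ratio, (E0[L] - E1[L])^2 / (var0[L] + var1[L]). Under these assumptions the ratio has the closed form (s^2/2)^2 / (d + s^2) for a shift of norm s, so it decays roughly as 1/d.

```python
import math
import random
import statistics


def log_likelihood(x):
    """Log-density of x under the standard Gaussian N(0, I_d)."""
    d = len(x)
    return -0.5 * d * math.log(2 * math.pi) - 0.5 * sum(v * v for v in x)


def sample(d, shift, rng):
    """Draw x ~ N(mu, I_d) with mu = (shift, 0, ..., 0), so ||mu|| = |shift|."""
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    x[0] += shift
    return x


def empirical_snr(d, shift, n=2000, seed=0):
    """Monte Carlo estimate of the (illustrative) SNR of the log-likelihood."""
    rng = random.Random(seed)
    pre = [log_likelihood(sample(d, 0.0, rng)) for _ in range(n)]
    post = [log_likelihood(sample(d, shift, rng)) for _ in range(n)]
    gap = statistics.fmean(pre) - statistics.fmean(post)
    return gap ** 2 / (statistics.variance(pre) + statistics.variance(post))


def theoretical_snr(d, shift):
    """Closed form for this Gaussian mean-shift setting: (s^2/2)^2 / (d + s^2)."""
    lam = shift ** 2
    return (lam / 2) ** 2 / (d + lam)


if __name__ == "__main__":
    # The shift norm is held at 2.0 while the dimension grows.
    for d in (1, 4, 16, 64):
        print(d, round(empirical_snr(d, 2.0), 3), round(theoretical_snr(d, 2.0), 3))
```

The change is equally large in every run (same shift norm), yet the gap it produces in the mean log-likelihood stays constant while the variance of the log-likelihood grows linearly with d, which is why the change becomes harder to detect.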

Tutorial Outline

  • Illustrative examples from real-world applications
  • Problem formulation: Anomaly / Change Detection Problems in a Statistical Framework
  • Solutions in the ideal settings: change/anomaly detection when distributions are known
  • Anomaly detection in more realistic settings: when training data are provided:
    1. Supervised methods: classifiers and the main challenges (class imbalance, concept drift, sample selection bias) in a fraud-detection context.
    2. Semi-supervised methods: density-based methods, domain-based methods (e.g. one-class SVM).
    3. Unsupervised methods: distance-based methods, clustering methods.
  • Change detection in more realistic settings: when training data are provided:
    1. Sequential monitoring: parametric techniques (change-point methods).
    2. Sequential monitoring: nonparametric techniques (change-detection tests and monitoring the log-likelihood).
  • Out of the Random Variable World: change/anomaly detection methods for signals, images.
  • Detectability loss in high-dimensional data: how data dimension affects monitoring the log-likelihood
  • Concluding remarks
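
For the ideal setting in the outline, where both the pre-change and the post-change distributions are known, a classical sequential test is CUSUM, which accumulates the log-likelihood ratio of incoming samples and raises an alarm when the sum exceeds a threshold. The sketch below is an illustration under stated assumptions, not necessarily the method presented in the tutorial: both distributions are univariate Gaussians with the same known standard deviation, and the function name, parameters and threshold value are hypothetical.

```python
def cusum(stream, mu0, mu1, sigma, threshold):
    """Return the index of the first alarm, or None if no change is detected.

    Assumes samples are N(mu0, sigma^2) before the change and
    N(mu1, sigma^2) after it, with all three parameters known.
    """
    g = 0.0  # CUSUM statistic, reset to zero whenever it would go negative
    for t, x in enumerate(stream):
        # Log-likelihood ratio log(phi1(x) / phi0(x)) for equal-variance Gaussians.
        llr = (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        g = max(0.0, g + llr)
        if g > threshold:
            return t
    return None


if __name__ == "__main__":
    # Noise-free toy stream: the mean jumps from 0 to 1 at index 50.
    stream = [0.0] * 50 + [1.0] * 50
    print(cusum(stream, mu0=0.0, mu1=1.0, sigma=1.0, threshold=4.0))  # prints 58
```

In this toy stream each post-change sample adds 0.5 to the statistic, so the alarm fires nine samples after the change. A higher threshold lowers the false-alarm rate at the cost of a longer detection delay.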


Slides

Change and Anomaly Detection in Data Streams
Giacomo Boracchi, tutorial at IJCNN 2017, the INNS/IEEE International Joint Conference on Neural Networks,
May 14-19, 2017, Anchorage, Alaska, USA
(Slides)

Change Detection in Data Streams: Big Data Challenges
Giacomo Boracchi, tutorial at the INNS Conference on Big Data,
October 23-25, 2016, Thessaloniki, Greece
(Abstract), (Slides)


References

[Alippi et al. 2016] Change Detection in Multivariate Datastreams: Likelihood and Detectability Loss
Cesare Alippi, Giacomo Boracchi, Diego Carrera, Manuel Roveri
International Joint Conference on Artificial Intelligence (IJCAI) 2016, New York, USA, July 9 - 13
(Preprint), (Original), (BibTeX).

[Carrera et al. 2016 a] ECG Monitoring in Wearable Devices by Sparse Models
Diego Carrera, Beatrice Rossi, Daniele Zambon, Pasqualina Fragneto, and Giacomo Boracchi
Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2016, Riva del Garda, Italy, September 19 - 23, Accepted, 16 pages
(Preprint)

[Carrera et al. 2016 b] Defect Detection in SEM Images of Nanofibrous Materials
Diego Carrera, Fabio Manganini, Giacomo Boracchi, Ettore Lanzarone
IEEE Transactions on Industrial Informatics, in press, 11 pages, doi:10.1109/TII.2016.2641472
(Preprint), (Original), (Dataset)

[Carrera et al. 2016 c] CCM: Controlling the Change Magnitude in High Dimensional Data
Cesare Alippi, Giacomo Boracchi, Diego Carrera
Proceedings of the INNS Conference on Big Data 2016, Thessaloniki, Greece, October 23 - 25, 2016, 10 pages
(Preprint), (Slides), (Codes).
This paper has received the Best Regular Paper Award.

[Dal Pozzolo et al. 2015] Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information
Andrea Dal Pozzolo, Giacomo Boracchi, Olivier Caelen, Cesare Alippi and Gianluca Bontempi
Proceedings of the International Joint Conference on Neural Networks, IJCNN 2015, Killarney, Ireland, July 12 - 17
(Preprint), (BibTeX), (Original).

[Boracchi and Roveri. 2014] Exploiting Self-Similarity for Change Detection
Giacomo Boracchi, Manuel Roveri
IJCNN 2014, International Joint Conference on Neural Networks, Beijing, China, July 6 - 11
(Preprint)

[Alippi et al. 2011 a] A Hierarchical, Nonparametric Sequential Change-Detection Test
Cesare Alippi, Giacomo Boracchi and Manuel Roveri,
in Proceedings of IJCNN 2011, the International Joint Conference on Neural Networks, San Jose, California July 31 - August 5, 2011. pp 2889 - 2896, doi: 10.1109/IJCNN.2011.6033600
(Preprint), (BibTeX), (Original)

[Alippi et al. 2011 b] A just-in-time adaptive classification system based on the intersection of confidence intervals rule
Cesare Alippi, Giacomo Boracchi, Manuel Roveri
Neural Networks, Elsevier vol. 24 (2011), pp. 791-800
doi: 10.1016/j.neunet.2011.05.012
(Preprint), (BibTeX), (Original)