Tutorial “Change and Anomaly Detection in Data Streams”
Presenter: Giacomo Boracchi, Politecnico di Milano, DEIB
Motivation
Detecting changes and anomalies is important in many application domains: a change might indicate an unforeseen evolution of the process generating the data or a fault in the sensing apparatus, to name two examples. Similarly, anomalies are often the most informative samples, such as frauds among genuine credit card transactions. When data come in the form of a stream, promptly detecting changes and anomalies provides valuable information for understanding the stream dynamics and for activating suitable countermeasures. Not surprisingly, the detection of changes and anomalies is a primary concern in financial analysis, quality inspection, and environmental and health-monitoring systems. Change detection also plays a central role in machine learning, where it is often the first step towards adaptation.
Aims and scope
The tutorial presents a rigorous formulation of the change/anomaly detection problem that fits sequential monitoring, classification and several signal/image analysis applications. The tutorial also illustrates the major approaches in the literature, considering supervised, semi-supervised and unsupervised monitoring scenarios.
Particular emphasis will be given to:
i) learning aspects related to change/anomaly detection and
ii) issues arising in big data scenarios, where data dimension becomes large.
In particular, change/anomaly detection methods for monitoring multivariate data will be discussed, including the popular approach of monitoring the log-likelihood, which will be shown to suffer from detectability loss as the data dimension increases.
The tutorial is accompanied by various examples of real-world problems solved by change/anomaly detection methods. These include ECG monitoring solutions for wearable devices, credit card fraud detection systems and image-analysis algorithms for detecting defects in industrial monitoring scenarios.
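The effect of data dimension on log-likelihood monitoring can be illustrated with a small simulation. The following is a minimal sketch, not the tutorial's own code: it assumes standard Gaussian data before the change, a mean shift of fixed norm `delta` after the change (so the change magnitude is the same for every dimension `d`), and a hypothetical helper `loglik_snr` that measures how visible the change is in the log-likelihood values, as the squared difference of their means divided by the sum of their variances.

```python
import numpy as np

def loglik_snr(d, delta=1.0, n=20000, seed=0):
    """Estimate how visible a fixed-magnitude mean shift is in the log-likelihood.

    Assumptions (illustrative only): pre-change data are N(0, I_d); the change
    shifts the mean by a vector of norm `delta`, regardless of d.
    """
    rng = np.random.default_rng(seed)
    shift = np.zeros(d)
    shift[0] = delta  # fixed change magnitude in any dimension

    pre = rng.standard_normal((n, d))
    post = rng.standard_normal((n, d)) + shift

    def loglik(x):
        # log-density of N(0, I_d), i.e. the pre-change model
        return -0.5 * np.sum(x ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)

    l_pre, l_post = loglik(pre), loglik(post)
    # squared shift of the mean log-likelihood over the total variance
    return (l_pre.mean() - l_post.mean()) ** 2 / (l_pre.var() + l_post.var())

if __name__ == "__main__":
    for d in (1, 8, 64):
        print(d, loglik_snr(d))
```

Under these assumptions the estimate is expected to shrink as `d` grows: the change moves the mean log-likelihood by the same amount in every dimension, while the variance of the log-likelihood increases with `d`.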
Tutorial Outline
- Illustrative examples from real-world applications
- Problem formulation: Anomaly / Change Detection Problems in a Statistical Framework
- Solutions in the ideal settings: change/anomaly detection when distributions are known
- Anomaly detection in more realistic settings: when training data are provided:
- Supervised methods: classifiers and the main challenges (class imbalance, concept drift, sample selection bias) in a fraud-detection context.
- Semi-supervised methods: density-based methods, domain-based methods (e.g. one-class SVM).
- Unsupervised methods: distance-based methods, clustering methods.
- Change detection in more realistic settings: when training data are provided:
- Sequential monitoring: parametric techniques (change-point methods).
- Sequential monitoring: nonparametric techniques (change-detection tests and monitoring the log-likelihood).
- Out of the Random Variable World: change/anomaly detection methods for signals, images.
- Detectability loss in high-dimensional data: how data dimension affects monitoring the log-likelihood
- Concluding remarks
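As an illustration of the ideal setting listed in the outline, where the distributions before and after the change are known, one classical option is a CUSUM test on the log-likelihood ratio. The sketch below is an assumption-laden example rather than the method presented in the tutorial: it assumes scalar Gaussian data with known means `mu0` and `mu1` and a common known `sigma`, and the function name `cusum_detect` and the threshold value are hypothetical.

```python
import numpy as np

def cusum_detect(x, mu0, mu1, sigma, threshold):
    """Return the first index where the CUSUM statistic exceeds `threshold`,
    or None if no change is detected.

    Assumes (for illustration) pre-change data ~ N(mu0, sigma^2) and
    post-change data ~ N(mu1, sigma^2), both fully known.
    """
    g = 0.0
    for t, xt in enumerate(x):
        # log-likelihood ratio of the post-change vs. pre-change density at xt
        llr = (mu1 - mu0) / sigma ** 2 * (xt - 0.5 * (mu0 + mu1))
        # accumulate evidence for a change, resetting at zero
        g = max(0.0, g + llr)
        if g > threshold:
            return t
    return None

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # synthetic stream whose mean changes from 0 to 1 at sample 300
    stream = np.concatenate([rng.normal(0.0, 1.0, 300),
                             rng.normal(1.0, 1.0, 300)])
    print(cusum_detect(stream, mu0=0.0, mu1=1.0, sigma=1.0, threshold=10.0))
```

A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay; in practice it would be set from a target false-alarm rate rather than picked by hand as here.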
Slides
Change and Anomaly Detection in Data Streams
Giacomo Boracchi
Tutorial at IJCNN 2017, the INNS/IEEE International Joint Conference on Neural Networks,
May 14-19, 2017, Anchorage, Alaska, USA
(Slides)
Change Detection in Data Streams: Big Data Challenges
Giacomo Boracchi
Tutorial at INNS Conference on Big Data,
October 23-25, 2016, Thessaloniki, Greece
(Abstract), (Slides)
References
[Alippi et al. 2016] Change Detection in Multivariate Datastreams: Likelihood and Detectability Loss, Cesare Alippi, Giacomo Boracchi, Diego Carrera, Manuel Roveri, International Joint Conference on Artificial Intelligence (IJCAI) 2016, New York, USA, July 9 - 13
(Preprint), (Original), (BibTeX).
[Carrera et al. 2016 a] ECG Monitoring in Wearable Devices by Sparse Models, Diego Carrera, Beatrice Rossi, Daniele Zambon, Pasqualina Fragneto, and Giacomo Boracchi, Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery, ECML-PKDD 2016, Riva del Garda, Italy, September 19 - 23, Accepted, 16 pages
(Preprint)
[Carrera et al. 2016 b] Defect Detection in SEM Images of Nanofibrous Materials, Diego Carrera, Fabio Manganini, Giacomo Boracchi, Ettore Lanzarone, IEEE Transactions on Industrial Informatics -- In Press, 11 pages, doi:10.1109/TII.2016.2641472
(Preprint), (Original), (Dataset).
[Carrera et al. 2016 c] CCM: Controlling the Change Magnitude in High Dimensional Data, Cesare Alippi, Giacomo Boracchi, Diego Carrera, Proceedings of INNS Conference on Big Data, 2016, Thessaloniki, Greece, October 23 - 25, 2016, 10 pages
(Preprint), (Slides), (Codes).
This paper has received the Best Regular Paper Award.
[Dal Pozzolo et al. 2015] Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information, Andrea Dal Pozzolo, Giacomo Boracchi, Olivier Caelen, Cesare Alippi and Gianluca Bontempi, Proceedings of International Joint Conference on Neural Networks IJCNN 2015, Killarney, Ireland, July 12 - 17
(Preprint), (BibTeX), (Original).
[Boracchi and Roveri 2014] Exploiting Self-Similarity for Change Detection, Giacomo Boracchi, Manuel Roveri, IJCNN 2014 International Joint Conference on Neural Networks, Beijing, China, July 6 - 11
(Preprint)
[Alippi et al. 2011 a] A Hierarchical, Nonparametric Sequential Change-Detection Test, Cesare Alippi, Giacomo Boracchi and Manuel Roveri, in Proceedings of IJCNN 2011, the International Joint Conference on Neural Networks, San Jose, California, July 31 - August 5, 2011, pp. 2889 - 2896, doi: 10.1109/IJCNN.2011.6033600
(Preprint), (BibTeX), (Original).
[Alippi et al. 2011 b] A just-in-time adaptive classification system based on the intersection of confidence intervals rule, Cesare Alippi, Giacomo Boracchi, Manuel Roveri, Neural Networks, Elsevier, vol. 24 (2011), pp. 791-800, doi: 10.1016/j.neunet.2011.05.012
(Preprint), (BibTeX), (Original).