QTEWMA: QuantTree for change detection in multivariate datastreams
Abstract
We introduce QuantTree Exponentially Weighted Moving Average (QTEWMA), an effective online and nonparametric change-detection algorithm for multivariate datastreams. We model the initial data distribution by a QuantTree histogram [Boracchi et al. 2018], and define a novel statistic based on the Exponentially Weighted Moving Average (EWMA) chart. The properties of QuantTree guarantee that the distribution of our statistic is independent of the data distribution, enabling us to define thresholds that control the expected time before a false alarm, or Average Run Length (ARL0). We also introduce QTEWMAupdate, a variant specifically designed to cope with small training sets. Our experiments demonstrate that QTEWMA and QTEWMAupdate are the most powerful sequential monitoring schemes for multivariate datastreams that can control the ARL0.
Fig. 1: The EWMA statistic of QTEWMA enables change detection in multivariate datastreams (here the datastream is scalar for the sake of illustration).

QTEWMA
QTEWMA is a nonparametric online change-detection test based on QuantTree [Boracchi et al. 2018], designed to monitor multivariate datastreams. QTEWMA first models the initial data distribution by constructing a QuantTree histogram from a training set. It then monitors the proportion of datastream samples falling in each bin of the histogram by means of Exponentially Weighted Moving Average (EWMA) statistics. Finally, the QTEWMA statistic measures the overall deviation of these EWMA statistics from their expected values, and a change is reported when this statistic exceeds a threshold. Thanks to the properties of QuantTree, the distribution of the QTEWMA statistic does not depend on the data distribution, so we can use Monte Carlo simulations to compute thresholds that control the ARL0 on any datastream.

QTEWMAupdate
When the training set is small, a QuantTree histogram cannot adequately model the initial distribution, since the estimated bin probabilities might be far from the true ones. To overcome this issue, we propose QTEWMAupdate, a modified QTEWMA in which the estimated bin probabilities are incrementally updated during monitoring using the incoming data. The properties of QuantTree guarantee that the distribution of the QTEWMAupdate statistic does not depend on the data distribution either, so the same procedure designed for QTEWMA can be used to compute thresholds that control the ARL0 on any datastream.

The main advantages of QTEWMA and QTEWMAupdate:
Our code is available for download here
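For readers who want a feel for the monitoring scheme before downloading anything, the steps described above can be sketched in a few lines of Python. This is not the released code and makes several assumptions: the datastream is scalar (as in Fig. 1), the QuantTree histogram is replaced by a plain equal-frequency histogram built from training quantiles, the bins have uniform target probabilities, the overall deviation is measured with a Pearson-like sum of squared deviations, and the threshold is an arbitrary placeholder rather than one computed by Monte Carlo simulation to control the ARL0. The function names and the value of `lam` are hypothetical.

```python
import numpy as np

def fit_quantile_bins(train, n_bins):
    """Stand-in for a QuantTree histogram on scalar data (assumption):
    inner edges of an equal-frequency histogram of the training set."""
    probs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(train, probs)  # n_bins - 1 inner edges

def ewma_statistics(stream, edges, lam=0.03):
    """Deviation statistic after each sample of the stream.

    One EWMA per bin tracks the proportion of samples falling in that bin;
    the statistic sums the squared deviations from the expected bin
    probabilities, each divided by the expected probability (assumed form).
    """
    n_bins = len(edges) + 1
    pi = np.full(n_bins, 1.0 / n_bins)  # expected probabilities (uniform, assumed)
    z = pi.copy()                       # EWMAs start at their expected values
    stats = np.empty(len(stream))
    for t, x in enumerate(stream):
        y = np.zeros(n_bins)
        y[np.searchsorted(edges, x)] = 1.0   # indicator of the bin containing x
        z = (1.0 - lam) * z + lam * y        # EWMA update for every bin
        stats[t] = np.sum((z - pi) ** 2 / pi)
    return stats

def first_alarm(stats, threshold):
    """Index of the first sample whose statistic exceeds the threshold, or None."""
    above = np.flatnonzero(stats > threshold)
    return int(above[0]) if above.size else None

# Illustrative run: the mean of a Gaussian stream shifts at sample 500.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=1000)
stream = np.concatenate([rng.normal(0.0, 1.0, size=500),
                         rng.normal(3.0, 1.0, size=500)])
edges = fit_quantile_bins(train, n_bins=8)
stats = ewma_statistics(stream, edges)
alarm = first_alarm(stats, threshold=1.0)  # placeholder threshold, not ARL0-calibrated
print("first alarm at sample:", alarm)
```

The sketch does not cover QTEWMAupdate, which would additionally revise the estimated bin probabilities as new samples arrive.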

References
[Frittoli et al. 2022] Nonparametric and Online Change Detection in Multivariate Datastreams using QuantTree
[Frittoli et al. 2021] Change Detection in Multivariate Datastreams Controlling False Alarms
[Boracchi et al. 2018] QuantTree: Histograms for Change Detection in Multivariate Data Streams