Laurent Callot

Criteria for Classifying Forecasting Methods

Dec 07, 2022
Tim Januschowski, Jan Gasthaus, Yuyang Wang, David Salinas, Valentin Flunkert, Michael Bohlke-Schneider, Laurent Callot

Classifying forecasting methods as being either of a "machine learning" or "statistical" nature has become commonplace in parts of the forecasting literature and community, as exemplified by the M4 competition and the conclusions drawn by its organizers. We argue that this distinction does not stem from fundamental differences in the methods assigned to either class. Instead, the distinction is probably of a tribal nature, and it limits the insights into the appropriateness and effectiveness of different forecasting methods. We provide alternative characteristics of forecasting methods which, in our view, allow meaningful conclusions to be drawn. Further, we discuss the areas of forecasting that could benefit most from cross-pollination between the ML and statistics communities.

SpectraNet: Multivariate Forecasting and Imputation under Distribution Shifts and Missing Data

Oct 25, 2022
Cristian Challu, Peihong Jiang, Ying Nian Wu, Laurent Callot

In this work, we tackle two widespread but largely understudied challenges in real-world time-series forecasting applications: distribution shifts and missing data. We propose SpectraNet, a novel multivariate time-series forecasting model that dynamically infers a latent space spectral decomposition to capture the current temporal dynamics and correlations in the recently observed history. A Convolutional Neural Network maps the learned representation to the output by sequentially mixing its components and refining the result. Our approach can simultaneously produce forecasts and interpolate past observations, and can therefore greatly simplify production systems by unifying the imputation and forecasting tasks in a single model. SpectraNet achieves state-of-the-art performance on both tasks simultaneously on five benchmark datasets, compared against dedicated forecasting and imputation models, with up to 92% fewer parameters and comparable training times. In settings with up to 80% missing data, SpectraNet improves average performance by almost 50% over the second-best alternative. Our code is available at https://github.com/cchallu/spectranet.
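
The core idea of inferring a latent spectral representation from a possibly incomplete window and decoding it into both an interpolation of the past and a forecast can be illustrated with a minimal numpy sketch. The code below fits Fourier-basis coefficients by least squares on the observed time stamps only; it is a conceptual stand-in under assumed basis and window sizes, not the actual SpectraNet model, which decodes its latent space with a learned convolutional network.

```python
import numpy as np

def fit_spectral_latent(y, mask, horizon, n_freqs=8):
    """Illustrative sketch: infer latent spectral coefficients from an
    observed (possibly incomplete) history and use one latent vector to
    both impute the past and extrapolate a forecast. Not the actual
    SpectraNet architecture, which decodes the latent space with a CNN.

    y       : (T,) observed window; values at unobserved positions are ignored
    mask    : (T,) boolean, True where y is observed
    horizon : number of future steps to forecast
    """
    T = len(y)
    t_hist = np.arange(T)
    t_full = np.arange(T + horizon)

    def basis(t):
        cols = [np.ones_like(t, dtype=float)]
        for k in range(1, n_freqs + 1):
            cols.append(np.sin(2 * np.pi * k * t / T))
            cols.append(np.cos(2 * np.pi * k * t / T))
        return np.stack(cols, axis=1)

    # Fit the coefficients on the observed time stamps only (handles missing data).
    coeffs, *_ = np.linalg.lstsq(basis(t_hist)[mask], y[mask], rcond=None)

    # The same latent vector produces the imputed history and the forecast.
    recon = basis(t_full) @ coeffs
    return recon[:T], recon[T:]

# Usage: a noisy daily-seasonal series with roughly 40% of the window missing.
rng = np.random.default_rng(0)
T = 120
y = np.sin(2 * np.pi * np.arange(T) / 24) + 0.1 * rng.normal(size=T)
mask = rng.random(T) > 0.4
imputed, forecast = fit_spectral_latent(y, mask, horizon=24)
```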

Unsupervised Model Selection for Time-series Anomaly Detection

Oct 03, 2022
Mononito Goswami, Cristian Challu, Laurent Callot, Lenon Minorics, Andrey Kan

Anomaly detection in time series has a wide range of practical applications. While numerous anomaly detection methods have been proposed in the literature, a recent survey concluded that no single method is the most accurate across various datasets. To make matters worse, anomaly labels are scarce and rarely available in practice. The practical problem of selecting the most accurate model for a given dataset without labels has received little attention in the literature. This paper addresses that question: given an unlabeled dataset and a set of candidate anomaly detectors, how can we select the most accurate model? To this end, we identify three classes of surrogate (unsupervised) metrics, namely prediction error, model centrality, and performance on injected synthetic anomalies, and show that some of these metrics are highly correlated with standard supervised anomaly detection performance metrics such as the $F_1$ score, though to varying degrees. We formulate the combination of multiple imperfect surrogate metrics as a robust rank aggregation problem and provide theoretical justification for the proposed approach. Large-scale experiments on multiple real-world datasets demonstrate that our unsupervised approach is as effective as selecting the most accurate model based on partially labeled data.
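
A minimal sketch of this recipe (score each candidate detector with several surrogate metrics, then aggregate the per-metric rankings robustly) is given below. The trimmed-mean aggregation and all names are illustrative assumptions; the paper formulates and solves a specific robust rank aggregation problem that this sketch does not reproduce.

```python
import numpy as np

def select_model(surrogate_scores, trim=0.2):
    """Pick a model by aggregating ranks from several imperfect surrogate
    metrics (e.g. prediction error, model centrality, performance on
    injected synthetic anomalies). Illustrative: uses a trimmed mean of
    ranks, not necessarily the aggregation scheme proposed in the paper.

    surrogate_scores : (n_metrics, n_models) array, higher = better
    """
    n_metrics, _ = surrogate_scores.shape
    # Rank the models under each metric (rank 0 = best under that metric).
    ranks = np.argsort(np.argsort(-surrogate_scores, axis=1), axis=1).astype(float)

    # A trimmed mean across metrics down-weights metrics that disagree wildly.
    k = int(trim * n_metrics)
    sorted_ranks = np.sort(ranks, axis=0)
    trimmed = sorted_ranks[k:n_metrics - k] if k > 0 else sorted_ranks
    aggregated = trimmed.mean(axis=0)
    return int(np.argmin(aggregated)), aggregated

# Usage: 3 surrogate metrics scoring 4 candidate anomaly detectors.
scores = np.array([[0.7, 0.9, 0.4, 0.6],
                   [0.6, 0.8, 0.5, 0.7],
                   [0.9, 0.3, 0.4, 0.8]])
best_model, aggregated_ranks = select_model(scores)
```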

Robust Projection based Anomaly Extraction (RPE) in Univariate Time-Series

May 31, 2022
Mostafa Rahmani, Anoop Deoras, Laurent Callot

This paper presents a novel, closed-form, and data- and computation-efficient online anomaly detection algorithm for time-series data. The proposed method, dubbed RPE, is a window-based method that, in sharp contrast to existing window-based methods, is robust to the presence of anomalies in its window and can distinguish anomalies at the time-stamp level. RPE leverages the linear structure of the trajectory matrix of the time series and employs a robust projection step, which enables the algorithm to handle multiple, arbitrarily large anomalies in its window. A closed-form, non-iterative algorithm for the robust projection step is provided, and it is proved that it identifies the corrupted time stamps. RPE is a strong candidate for applications where large amounts of training data are not available, which is a common scenario for time series. An extensive set of numerical experiments shows that RPE outperforms existing approaches by a notable margin.
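
The two ingredients described above, the trajectory (Hankel) matrix of a window and a projection whose residuals localize anomalies at the time-stamp level, can be sketched as follows. The plain SVD projection used here is only a stand-in: RPE's contribution is a closed-form robust projection that tolerates arbitrarily large anomalies inside the window, which an ordinary SVD does not guarantee.

```python
import numpy as np

def trajectory_matrix(x, L):
    """Trajectory (Hankel) matrix of a 1-D series, with window length L."""
    return np.stack([x[i:i + L] for i in range(len(x) - L + 1)], axis=1)

def flag_anomalies(window, L=20, rank=2, z_thresh=4.0):
    """Illustrative stand-in for RPE: project the trajectory matrix onto its
    top singular subspace and flag time stamps with large reconstruction
    residuals. RPE replaces this SVD step with a closed-form *robust*
    projection that can handle arbitrarily large anomalies in the window."""
    H = trajectory_matrix(window, L)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    residual = np.abs(H - H_low)

    # Average the residual per original time stamp (each appears in several columns).
    T = len(window)
    res, cnt = np.zeros(T), np.zeros(T)
    for j in range(H.shape[1]):
        res[j:j + L] += residual[:, j]
        cnt[j:j + L] += 1
    res /= cnt

    # Robust z-score of the per-time-stamp residuals.
    mad = np.median(np.abs(res - np.median(res))) + 1e-9
    return (res - np.median(res)) / mad > z_thresh

# Usage: a noisy sine with two spike anomalies.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * np.arange(200) / 25) + 0.05 * rng.normal(size=200)
x[[60, 140]] += 5.0
print(np.where(flag_anomalies(x))[0])
```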

Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection

Feb 25, 2022
Cristian Challu, Peihong Jiang, Ying Nian Wu, Laurent Callot

Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detection, trained by maximizing the observed likelihood via posterior sampling and alternating back-propagation. A top-down Convolutional Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently. Despite relying on posterior sampling, DGHL is computationally more efficient than current approaches, with up to 10x shorter training times than RNN-based models. Our method outperformed current state-of-the-art models on four popular benchmark datasets. Finally, DGHL is robust to variable features between entities and remains accurate even with large proportions of missing values, settings of increasing relevance with the advent of IoT. We demonstrate the superior robustness of DGHL with occlusion experiments that are novel in this literature. Our code is available at https://github.com/cchallu/dghl.

* accepted at AISTATS 2022 
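
A toy illustration of the training scheme named in the abstract (alternating back-propagation, in which latent variables are inferred by Langevin posterior sampling and the generator is then updated by a gradient step) is sketched below for a simple linear-Gaussian generator. DGHL itself uses a hierarchical latent space and a top-down convolutional generator; the model, step sizes, and anomaly score here are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def langevin_posterior(x, W, sigma=1.0, steps=50, eps=0.1):
    """Sample z approximately from p(z | x) by Langevin dynamics for a toy
    linear generator x ~ N(W z, sigma^2 I) with prior z ~ N(0, I)."""
    z = rng.normal(size=W.shape[1])
    for _ in range(steps):
        grad = W.T @ (x - W @ z) / sigma**2 - z          # d/dz log p(x, z)
        z = z + 0.5 * eps**2 * grad + eps * rng.normal(size=z.shape)
    return z

def alternating_backprop(X, latent_dim=3, iters=200, lr=0.01, sigma=1.0):
    """Alternate (1) posterior sampling of the latents and (2) a gradient
    step on the generator parameters, here a single linear map W."""
    n, d = X.shape
    W = 0.01 * rng.normal(size=(d, latent_dim))
    for _ in range(iters):
        i = rng.integers(n)
        z = langevin_posterior(X[i], W, sigma=sigma)
        # Gradient of the log-likelihood w.r.t. W at the sampled latent.
        W += lr * np.outer(X[i] - W @ z, z) / sigma**2
    return W

def anomaly_score(x, W, sigma=1.0):
    """Reconstruction error under the inferred latent; large = anomalous."""
    z = langevin_posterior(x, W, sigma=sigma)
    return float(np.sum((x - W @ z) ** 2))

# Usage: train on (mostly normal) windows, then score a shifted window.
X = rng.normal(size=(200, 16))
W = alternating_backprop(X)
print(anomaly_score(X[0], W), anomaly_score(X[0] + 5.0, W))
```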

Testing Granger Non-Causality in Panels with Cross-Sectional Dependencies

Feb 23, 2022
Lenon Minorics, Caner Turkmen, David Kernert, Patrick Bloebaum, Laurent Callot, Dominik Janzing

This paper proposes a new approach for testing Granger non-causality on panel data. Instead of aggregating panel member statistics, we aggregate their corresponding p-values and show that the resulting p-value approximately bounds the type I error by the chosen significance level, even if the panel members are dependent. We compare our approach against the most widely used Granger causality algorithm for panel data and show that ours yields a lower FDR at the same power for large sample sizes and panels with cross-sectional dependencies. Finally, we examine COVID-19 data on confirmed cases and deaths measured in countries and regions worldwide and show that our approach discovers the true causal relation between confirmed cases and deaths where state-of-the-art approaches fail.
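
A sketch of the general recipe (compute a Granger non-causality p-value per panel member, then aggregate the member p-values with a combiner that remains valid under cross-sectional dependence) is given below. The per-member test is the standard lag-regression F-test, and the combiner is the simple "twice the average p-value" bound, which is valid under arbitrary dependence; the aggregation rule proposed in the paper may differ.

```python
import numpy as np
from scipy import stats

def granger_pvalue(y, x, lags=2):
    """p-value of the standard F-test for Granger non-causality of x on y
    in a single panel member (lag-regression version, for illustration)."""
    T = len(y)
    Y = y[lags:]
    Z_full = np.column_stack(
        [np.ones(T - lags)]
        + [y[lags - k:T - k] for k in range(1, lags + 1)]   # own lags
        + [x[lags - k:T - k] for k in range(1, lags + 1)]   # candidate cause lags
    )
    Z_restricted = Z_full[:, :1 + lags]                      # model without x lags
    rss = lambda Z: np.sum((Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]) ** 2)
    rss_r, rss_f = rss(Z_restricted), rss(Z_full)
    df1, df2 = lags, len(Y) - Z_full.shape[1]
    F = ((rss_r - rss_f) / df1) / (rss_f / df2)
    return stats.f.sf(F, df1, df2)

def panel_pvalue(member_pvalues):
    """Combine per-member p-values into one panel-level p-value.
    2 * mean(p) is a valid p-value under arbitrary dependence between the
    members; the paper's aggregation rule may differ from this bound."""
    return min(1.0, 2.0 * float(np.mean(member_pvalues)))

# Usage: x Granger-causes y in every member of a toy 5-member panel.
rng = np.random.default_rng(0)
pvals = []
for _ in range(5):
    x = rng.normal(size=300)
    y = np.zeros(300)
    for t in range(1, 300):
        y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + 0.1 * rng.normal()
    pvals.append(granger_pvalue(y, x, lags=2))
print(panel_pvalue(pvals))
```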

Online Time Series Anomaly Detection with State Space Gaussian Processes

Jan 18, 2022
Christian Bock, François-Xavier Aubet, Jan Gasthaus, Andrey Kan, Ming Chen, Laurent Callot

We propose r-ssGPFA, an unsupervised online anomaly detection model for uni- and multivariate time series, building on the efficient state space formulation of Gaussian processes. For high-dimensional time series, we propose an extension of Gaussian process factor analysis to identify the common latent processes of the time series, allowing us to detect anomalies efficiently and in an interpretable manner. We gain explainability while speeding up computations by imposing an orthogonality constraint on the mapping from the latent to the observed space. Our model's robustness is improved by using a simple heuristic to skip Kalman updates when encountering anomalous observations. We investigate the behaviour of our model on synthetic data and show on standard benchmark datasets that our method is competitive with state-of-the-art methods while being computationally cheaper.
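
The robustness heuristic mentioned above (skip the Kalman update whenever an observation looks anomalous, so that the anomaly cannot contaminate the state) is easy to illustrate on a scalar local-level model. The sketch below is not r-ssGPFA itself, which applies the idea inside a state space Gaussian process factor analysis model; the noise variances and threshold are illustrative assumptions.

```python
import numpy as np

def kalman_anomaly_filter(y, q=1e-3, r=1e-2, z_thresh=3.0):
    """Scalar local-level Kalman filter that flags an observation as
    anomalous when its standardized innovation exceeds z_thresh and then
    skips the update, so the anomaly does not corrupt the filtered state."""
    m, P = y[0], 1.0                       # state mean and variance
    flags = np.zeros(len(y), dtype=bool)
    for t, obs in enumerate(y):
        P = P + q                          # predict (random-walk transition)
        v = obs - m                        # innovation
        S = P + r                          # innovation variance
        if abs(v) / np.sqrt(S) > z_thresh:
            flags[t] = True                # anomalous: keep prediction, skip update
            continue
        K = P / S                          # Kalman gain
        m = m + K * v
        P = (1.0 - K) * P
    return flags

# Usage: a slowly drifting signal with two injected outliers.
rng = np.random.default_rng(1)
y = np.cumsum(0.01 * rng.normal(size=300)) + 0.05 * rng.normal(size=300)
y[[100, 220]] += 2.0
print(np.where(kalman_anomaly_filter(y))[0])
```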

Online false discovery rate control for anomaly detection in time series

Dec 06, 2021
Quentin Rebjock, Barış Kurt, Tim Januschowski, Laurent Callot

This article proposes novel rules for false discovery rate control (FDRC) geared towards online anomaly detection in time series. Online FDRC rules make it possible to control the properties of a sequence of statistical tests. In the context of anomaly detection, the null hypothesis is that an observation is normal and the alternative is that it is anomalous. FDRC rules allow users to target a lower bound on precision in unsupervised settings. The methods proposed in this article overcome shortcomings of previous FDRC rules in the context of anomaly detection, in particular by ensuring that power remains high even when the alternative is exceedingly rare (typical in anomaly detection) and the test statistics are serially dependent (typical in time series). We show the soundness of these rules in both theory and experiments.
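
For reference, the sketch below implements a generic LORD++-style online FDR rule, in which the significance level spent on each test depends on the times of past rejections. It is only a baseline to make the online FDRC setting concrete; the rules proposed in this article are designed for the anomaly detection regime (exceedingly rare alternatives, serially dependent test statistics) and are not reproduced here.

```python
import numpy as np

def lord_plus_plus(pvalues, alpha=0.05, w0=0.025):
    """Generic LORD++-style online FDR rule: alpha_t is built from an
    initial wealth w0 plus contributions earned at past rejection times.
    Shown as a baseline only, not the FDRC rules proposed in the article."""
    gamma = lambda k: 6.0 / (np.pi**2 * k**2) if k >= 1 else 0.0
    reject_times, decisions = [], []
    for t, p in enumerate(pvalues, start=1):
        a_t = gamma(t) * w0
        if reject_times:
            a_t += (alpha - w0) * gamma(t - reject_times[0])
            a_t += alpha * sum(gamma(t - tau) for tau in reject_times[1:])
        rejected = p <= a_t
        decisions.append(rejected)
        if rejected:
            reject_times.append(t)
    return np.array(decisions)

# Usage: a stream of mostly-null p-values with a few very small ones.
rng = np.random.default_rng(0)
p_stream = rng.uniform(size=500)
p_stream[[50, 51, 300]] = 1e-7
print(np.where(lord_plus_plus(p_stream))[0])
```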

Spliced Binned-Pareto Distribution for Robust Modeling of Heavy-tailed Time Series

Jun 21, 2021
Elena Ehrlich, Laurent Callot, François-Xavier Aubet

This work proposes a novel method to robustly and accurately model time series with heavy-tailed noise in non-stationary scenarios. In many practical applications, time series have heavy-tailed noise that significantly impacts the performance of classical forecasting models; in particular, accurately modeling a distribution over extreme events is crucial to performing accurate time series anomaly detection. We propose a Spliced Binned-Pareto distribution which is both robust to extreme observations and allows accurate modeling of the full distribution. Our method captures time dependencies in the higher-order moments of the distribution, such as the tail heaviness. We compare the robustness and the accuracy of the tail estimation of our method against other state-of-the-art methods on Twitter mention count time series.

* Accepted at RobustWorkshop@ICLR2021: <https://sites.google.com/connect.hku.hk/robustml-2021/accepted-papers/paper-041> 
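
The construction of the distribution, a binned (piecewise-constant) body spliced with generalized Pareto tails that carry fixed tail masses, can be sketched as a density function as follows. The function and its arguments are assumptions made for illustration, not the exact parameterization or the neural forecasting head used in the paper.

```python
import numpy as np
from scipy.stats import genpareto

def spliced_binned_pareto_pdf(x, bin_edges, bin_probs, p_lower, p_upper,
                              xi_lower, beta_lower, xi_upper, beta_upper):
    """Density of an illustrative spliced distribution: a piecewise-constant
    (binned) body between bin_edges[0] and bin_edges[-1], with generalized
    Pareto tails carrying masses p_lower / p_upper below and above the body.
    A hand-rolled sketch of the construction described in the abstract."""
    x = np.asarray(x, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    lo, hi = bin_edges[0], bin_edges[-1]
    widths = np.diff(bin_edges)
    body_mass = 1.0 - p_lower - p_upper
    bin_probs = np.asarray(bin_probs, dtype=float)
    bin_probs = bin_probs / bin_probs.sum()              # normalize the body bins

    pdf = np.zeros_like(x)
    in_body = (x >= lo) & (x <= hi)
    idx = np.clip(np.searchsorted(bin_edges, x[in_body], side="right") - 1,
                  0, len(widths) - 1)
    pdf[in_body] = body_mass * bin_probs[idx] / widths[idx]

    below, above = x < lo, x > hi
    pdf[below] = p_lower * genpareto.pdf(lo - x[below], c=xi_lower, scale=beta_lower)
    pdf[above] = p_upper * genpareto.pdf(x[above] - hi, c=xi_upper, scale=beta_upper)
    return pdf

# Usage: a 12-bin body on [-3, 3] with 5% of mass in each Pareto tail.
edges = np.linspace(-3.0, 3.0, 13)
print(spliced_binned_pareto_pdf(np.array([-5.0, 0.1, 4.0]), edges, np.ones(12),
                                p_lower=0.05, p_upper=0.05,
                                xi_lower=0.2, beta_lower=1.0,
                                xi_upper=0.2, beta_upper=1.0))
```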