Kashif Rasul


Deep Learning based Forecasting: a case study from the online fashion industry

May 23, 2023
Manuel Kunz, Stefan Birr, Mones Raslan, Lei Ma, Zhen Li, Adele Gouttes, Mateusz Koren, Tofigh Naghibi, Johannes Stephan, Mariia Bulycheva, Matthias Grzeschik, Armin Kekić, Michael Narodovitch, Kashif Rasul, Julian Sieber, Tim Januschowski


Demand forecasting in the online fashion industry is particularly amenable to global, data-driven forecasting models because of the industry's particular set of challenges. These include the volume of data, its irregularity, the high turnover of the catalog, and the fixed-inventory assumption. While standard deep learning forecasting approaches cater for many of these, the fixed-inventory assumption requires special treatment: the relationship between price and demand must be controlled closely. In this case study, we describe the data and our modelling approach for this forecasting problem in detail and present empirical results that highlight the effectiveness of our approach.


Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation

May 12, 2023
Yu Chen, Wei Deng, Shikai Fang, Fengpei Li, Nicole Tianjiao Yang, Yikai Zhang, Kashif Rasul, Shandian Zhe, Anderson Schneider, Yuriy Nevmyvaka


The Schrödinger bridge problem (SBP) is gaining increasing attention in generative modeling and shows promising potential even in comparison with score-based generative models (SGMs). SBP can be interpreted as an entropy-regularized optimal transport problem, which alternately projects onto each of the two marginals. However, in practice only approximate projections are accessible, and their convergence is not well understood. To fill this gap, we present a first convergence analysis of the Schrödinger bridge algorithm based on approximate projections. As for its practical applications, we apply SBP to probabilistic time series imputation by generating missing values conditioned on observed data. We show that optimizing the transport cost improves performance, and the proposed algorithm achieves state-of-the-art results on healthcare and environmental data while exhibiting the advantage of exploring both temporal and feature patterns in probabilistic time series imputation.
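The alternating-projection view of entropy-regularized optimal transport mentioned in the abstract can be illustrated with a plain Sinkhorn iteration. This is a generic sketch of entropic-OT projections, not the paper's Schrödinger bridge algorithm; the cost matrix `C` and marginals `mu`, `nu` are toy data.

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, iters=200):
    """Alternate Bregman projections onto the two marginal constraints
    of the entropy-regularized optimal transport problem."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)               # project onto the nu-marginal
        u = mu / (K @ v)                 # project onto the mu-marginal
    return u[:, None] * K * v[None, :]   # approximate transport plan

rng = np.random.default_rng(0)
mu = np.full(4, 0.25)                    # uniform source marginal
nu = np.full(4, 0.25)                    # uniform target marginal
C = rng.random((4, 4))                   # toy cost matrix
P = sinkhorn(mu, nu, C)
```

In practice only finitely many such projections are run, which is exactly the approximate-projection setting whose convergence the paper analyzes.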

* Accepted by ICML 2023 

Modeling Temporal Data as Continuous Functions with Process Diffusion

Nov 04, 2022
Marin Biloš, Kashif Rasul, Anderson Schneider, Yuriy Nevmyvaka, Stephan Günnemann


Temporal data like time series are often observed at irregular intervals which is a challenging setting for existing machine learning methods. To tackle this problem, we view such data as samples from some underlying continuous function. We then define a diffusion-based generative model that adds noise from a predefined stochastic process while preserving the continuity of the resulting underlying function. A neural network is trained to reverse this process which allows us to sample new realizations from the learned distribution. We define suitable stochastic processes as noise sources and introduce novel denoising and score-matching models on processes. Further, we show how to apply this approach to the multivariate probabilistic forecasting and imputation tasks. Through our extensive experiments, we demonstrate that our method outperforms previous models on synthetic and real-world datasets.


Intrinsic Anomaly Detection for Multi-Variate Time Series

Jun 29, 2022
Stephan Rabanser, Tim Januschowski, Kashif Rasul, Oliver Borchert, Richard Kurle, Jan Gasthaus, Michael Bohlke-Schneider, Nicolas Papernot, Valentin Flunkert


We introduce a novel, practically relevant variation of the anomaly detection problem in multi-variate time series: intrinsic anomaly detection. It appears in diverse practical scenarios ranging from DevOps to IoT, where we want to recognize failures of a system that operates under the influence of a surrounding environment. Intrinsic anomalies are changes in the functional dependency structure between time series that represent an environment and time series that represent the internal state of a system placed in said environment. We formalize this problem, provide under-studied public and new purpose-built data sets for it, and present methods that handle intrinsic anomaly detection. These address the shortcoming of existing anomaly detection methods, which cannot differentiate between expected changes in the system's state and unexpected ones, i.e., changes in the system that deviate from the environment's influence. Our most promising approach is fully unsupervised and combines adversarial learning with time series representation learning, thereby addressing problems such as label sparsity and subjectivity while allowing us to navigate and improve notoriously problematic anomaly detection data sets.


VQ-AR: Vector Quantized Autoregressive Probabilistic Time Series Forecasting

May 31, 2022
Kashif Rasul, Young-Jin Park, Max Nihlén Ramström, Kyung-Min Kim


Time series models aim for accurate predictions of the future given the past, where the forecasts are used for important downstream tasks like business decision making. In practice, deep learning based time series models come in many forms, but at a high level learn some continuous representation of the past and use it to output point or probabilistic forecasts. In this paper, we introduce a novel autoregressive architecture, VQ-AR, which instead learns a discrete set of representations that are used to predict the future. Extensive empirical comparison with other competitive deep learning models shows that, surprisingly, such a discrete set of representations gives state-of-the-art or equivalent results on a wide variety of time series datasets. We also highlight the shortcomings of this approach, explore its zero-shot generalization capabilities, and present an ablation study on the number of representations. The full source code of the method will be available at the time of publication, with the hope that researchers can further investigate this important but overlooked inductive bias for the time series domain.
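The core discrete-representation step, nearest-neighbour codebook lookup as in vector quantization, can be sketched as follows. This is a generic VQ illustration, not the paper's VQ-AR architecture; the codebook and inputs are random toy data.

```python
import numpy as np

def quantize(z, codebook):
    """Map each continuous vector in z to the index and value of its
    nearest codebook entry (its discrete representation)."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)               # one code index per input vector
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 2))       # 8 discrete codes of dimension 2
z = rng.normal(size=(5, 2))              # toy continuous representations
idx, zq = quantize(z, codebook)
```

The integer indices `idx` are what an autoregressive model over discrete representations would predict, rather than the continuous vectors themselves.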


Probabilistic Time Series Forecasting with Implicit Quantile Networks

Jul 08, 2021
Adèle Gouttes, Kashif Rasul, Mateusz Koren, Johannes Stephan, Tofigh Naghibi


Here, we propose a general method for probabilistic time series forecasting. We combine an autoregressive recurrent neural network, which models the temporal dynamics, with Implicit Quantile Networks, which learn a large class of distributions over a time-series target. Compared to other probabilistic neural forecasting models on real and simulated data, our approach is favorable in terms of point-wise prediction accuracy as well as in estimating the underlying temporal distribution.
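The training objective underlying quantile-based forecasters such as Implicit Quantile Networks is the pinball (quantile) loss. A minimal sketch with toy targets, showing how a single quantile level `tau` weights under- versus over-predictions; this is the generic loss, not the paper's full model:

```python
import numpy as np

def quantile_loss(y, y_hat, tau):
    """Pinball loss at quantile level tau: under-predictions cost tau
    per unit of error, over-predictions cost (1 - tau) per unit."""
    err = y - y_hat
    return np.mean(np.maximum(tau * err, (tau - 1) * err))

y = np.array([1.0, 2.0, 3.0])
loss_median = quantile_loss(y, y + 1.0, tau=0.5)  # symmetric penalty
loss_upper = quantile_loss(y, y + 1.0, tau=0.9)   # over-predicting is cheap here
```

An IQN-style model samples `tau` at random during training and conditions the network on it, so one network represents the whole predictive distribution.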

* Accepted at the ICML 2021 Time Series Workshop 

Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting

Feb 02, 2021
Kashif Rasul, Calvin Seward, Ingmar Schuster, Roland Vollgraf


In this work, we propose TimeGrad, an autoregressive model for multivariate probabilistic time series forecasting which samples from the data distribution at each time step by estimating its gradient. To this end, we use diffusion probabilistic models, a class of latent variable models closely connected to score matching and energy-based methods. Our model learns gradients by optimizing a variational bound on the data likelihood; at inference time, it converts white noise into a sample of the distribution of interest through a Markov chain using Langevin sampling. We demonstrate experimentally that the proposed autoregressive denoising diffusion model sets a new state of the art for multivariate probabilistic forecasting on real-world data sets with thousands of correlated dimensions. We hope that this method is a useful tool for practitioners and lays the foundation for future research in this area.
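The closed-form forward-noising step that denoising diffusion models build on can be sketched as follows. This is a generic DDPM-style illustration with an assumed linear beta schedule, not TimeGrad's actual training code; the denoising network that would be trained to predict `eps` is omitted.

```python
import numpy as np

# Assumed linear beta schedule; alphas_bar shrinks toward zero so that
# the fully noised sample is close to pure Gaussian noise.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, eps):
    """Closed-form forward noising: a sample from q(x_t | x_0)."""
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.normal(size=3)    # toy observation at one time step
eps = rng.normal(size=3)   # the noise a network would learn to predict
x_noisy = q_sample(x0, T - 1, eps)
```

Reversing this chain step by step, starting from white noise, is what produces a sample of the target distribution at inference time.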


Multi-variate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows

Feb 14, 2020
Kashif Rasul, Abdul-Saboor Sheikh, Ingmar Schuster, Urs Bergmann, Roland Vollgraf


Time series forecasting is often fundamental to scientific and engineering problems and enables decision making. With ever-increasing data set sizes, a trivial solution to scale up predictions is to assume independence between interacting time series. However, modeling statistical dependencies can improve accuracy and enable analysis of interaction effects. Deep learning methods are well suited to this problem, but multivariate models often assume a simple parametric distribution and do not scale to high dimensions. In this work, we model the multivariate temporal dynamics of time series via an autoregressive deep learning model in which the data distribution is represented by a conditioned normalizing flow. This combination retains the strengths of autoregressive models, such as good performance when extrapolating into the future, with the flexibility of flows as general-purpose, high-dimensional distribution models, while remaining computationally tractable. We show that it improves over the state of the art on standard metrics on many real-world data sets with several thousand interacting time series.
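The building block of a conditioned normalizing flow is an affine coupling layer whose scale and shift depend both on part of the input and on a context vector (e.g. an RNN state summarizing the past). A toy sketch with random weights, not the paper's architecture:

```python
import numpy as np

def coupling_forward(x, h, w_s, w_t):
    """One conditioned affine coupling layer: the second half of x is
    scaled and shifted by functions of the first half and context h."""
    x1, x2 = np.split(x, 2)
    cond = np.concatenate([x1, h])
    s = np.tanh(w_s @ cond)          # log-scale (bounded for stability)
    t = w_t @ cond                   # shift
    y2 = x2 * np.exp(s) + t
    log_det = s.sum()                # exact log-det of the Jacobian
    return np.concatenate([x1, y2]), log_det

def coupling_inverse(y, h, w_s, w_t):
    """Exact inverse: recompute s, t from the untouched half and context."""
    y1, y2 = np.split(y, 2)
    cond = np.concatenate([y1, h])
    s = np.tanh(w_s @ cond)
    t = w_t @ cond
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2])

rng = np.random.default_rng(0)
x, h = rng.normal(size=4), rng.normal(size=3)          # h: e.g. an RNN state
w_s, w_t = rng.normal(size=(2, 5)), rng.normal(size=(2, 5))
y, log_det = coupling_forward(x, h, w_s, w_t)
```

Invertibility plus a cheap Jacobian log-determinant is what makes exact likelihood training of such flows tractable in high dimensions.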


Set Flow: A Permutation Invariant Normalizing Flow

Sep 06, 2019
Kashif Rasul, Ingmar Schuster, Roland Vollgraf, Urs Bergmann


We present a generative model defined on finite sets of exchangeable, potentially high-dimensional data. As the architecture is an extension of RealNVP, it inherits all of its favorable properties, such as being invertible and allowing exact log-likelihood evaluation. We show that this architecture can learn finite non-i.i.d. set data distributions, can capture statistical dependencies between entities of the set, and can train and sample with variable set sizes in a computationally efficient manner. Experiments on 3D point clouds show state-of-the-art likelihoods.
