Abstract: State space models, such as Mamba, have recently garnered attention in time series forecasting due to their ability to capture sequence patterns. However, in electricity consumption benchmarks, Mamba forecasts exhibit a mean error of approximately 8\%, and in traffic occupancy benchmarks the mean error reaches 18\%. This discrepancy raises the question of whether the prediction is simply inaccurate or whether it falls within the error expected given the spread in the historical data. To address this limitation, we propose a method to quantify the predictive uncertainty of Mamba forecasts: a dual-network framework based on the Mamba architecture for probabilistic forecasting, in which one network generates point forecasts while the other estimates predictive uncertainty by modeling the variance. We abbreviate our tool, Mamba with probabilistic time series forecasting, as Mamba-ProbTSF, and the code for its implementation is available on GitHub (https://github.com/PessoaP/Mamba-ProbTSF). Evaluating this approach on synthetic and real-world benchmark datasets, we find that the Kullback-Leibler divergence between the learned distributions and the data, which should converge to zero in the limit of infinite data if the model correctly captures the underlying probability distribution, is reduced to the order of $10^{-3}$ for synthetic data and $10^{-1}$ for the real-world benchmarks, demonstrating the method's effectiveness. We find that in both the electricity consumption and traffic occupancy benchmarks, the true trajectory stays within the predicted uncertainty interval at the two-sigma level about 95\% of the time. We end with a discussion of potential limitations, adjustments to improve performance, and considerations for applying this framework to processes with purely or largely stochastic dynamics in which stochastic changes accumulate, as observed, for example, in pure Brownian motion or molecular dynamics trajectories.
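To illustrate the dual-network idea described in the abstract, the following is a minimal sketch, not the authors' released implementation (which is available in the linked repository): one network predicts the point forecast, a second predicts the log-variance, and both are trained jointly with a Gaussian negative log-likelihood. The Mamba blocks are stood in for by LSTMs, and all class and variable names (SeqRegressor, mean_net, logvar_net) are hypothetical.

    # Minimal sketch (not the authors' implementation): one network outputs the
    # point forecast (mean), a second outputs the log-variance, trained jointly
    # with a Gaussian negative log-likelihood. LSTMs stand in for Mamba blocks.
    import torch
    import torch.nn as nn

    class SeqRegressor(nn.Module):
        def __init__(self, d_in=1, d_hidden=64, horizon=24):
            super().__init__()
            self.rnn = nn.LSTM(d_in, d_hidden, batch_first=True)  # stand-in for a Mamba block
            self.head = nn.Linear(d_hidden, horizon)

        def forward(self, x):            # x: (batch, lookback, d_in)
            h, _ = self.rnn(x)
            return self.head(h[:, -1])   # (batch, horizon)

    mean_net = SeqRegressor()            # point-forecast network
    logvar_net = SeqRegressor()          # uncertainty (variance) network
    opt = torch.optim.Adam(list(mean_net.parameters()) + list(logvar_net.parameters()), lr=1e-3)

    def gaussian_nll(y, mu, logvar):
        # negative log-likelihood of y under N(mu, exp(logvar)), averaged over the batch
        return 0.5 * (logvar + (y - mu) ** 2 / logvar.exp()).mean()

    # one hypothetical training step on a batch (x, y)
    x = torch.randn(32, 96, 1)           # 96-step lookback window
    y = torch.randn(32, 24)              # 24-step forecast target
    mu, logvar = mean_net(x), logvar_net(x)
    loss = gaussian_nll(y, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()

    # two-sigma interval used for the ~95% coverage check reported in the abstract
    sigma = (0.5 * logvar).exp()
    lower, upper = mu - 2 * sigma, mu + 2 * sigma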
Abstract: Richardson-Lucy (RL) deconvolution is widely used to restore images from the degradation caused by the broadening effects of a point spread function and corruption by photon shot noise, in order to recover the underlying object. In practice, this is achieved by iteratively maximizing a Poisson emission likelihood. However, the RL algorithm is known to prefer sparse solutions and to overfit noise, leading to high-frequency artifacts. The structure of these artifacts is sensitive to the number of RL iterations, and this parameter is typically hand-tuned to achieve reasonable perceptual quality of the inferred object. Overfitting can be mitigated by introducing tunable regularizers or other ad hoc iteration cutoffs into the optimization, since incorporating fully realistic models can otherwise introduce computational bottlenecks. To resolve these problems, we present Bayesian deconvolution, a rigorous deconvolution framework built on a physically accurate image formation model that avoids the challenges inherent to the RL approach. Our approach achieves deconvolution while satisfying the following desiderata: (I) deconvolution is performed in the spatial domain (as opposed to the frequency domain), where all known noise sources are accurately modeled and integrated in the spirit of providing full probability distributions over the density of the putative object recovered; (II) the probability distribution is estimated without making assumptions on the sparsity or continuity of the underlying object; (III) inference is unsupervised and converges to a stable solution with no user-dependent parameter tuning or iteration cutoff; (IV) deconvolution produces strictly positive solutions; and (V) the implementation is amenable to fast, parallelizable computation.
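For reference, below is a minimal sketch of the standard RL iteration that the abstract critiques, written with NumPy/SciPy; the hand-tuned iteration count n_iter is the user-dependent cutoff that the proposed Bayesian framework removes. Variable and function names are illustrative only and not taken from the paper.

    # Minimal sketch of the standard Richardson-Lucy (RL) update, which iteratively
    # maximizes the Poisson likelihood of the observed image given a known PSF.
    import numpy as np
    from scipy.signal import fftconvolve

    def richardson_lucy(observed, psf, n_iter=30, eps=1e-12):
        """Multiplicative RL updates; nonnegativity is preserved across iterations."""
        estimate = np.full_like(observed, observed.mean(), dtype=float)
        psf_mirror = psf[::-1, ::-1]
        for _ in range(n_iter):                    # n_iter is typically hand-tuned
            blurred = fftconvolve(estimate, psf, mode="same")
            ratio = observed / (blurred + eps)     # Poisson data-fidelity ratio
            estimate *= fftconvolve(ratio, psf_mirror, mode="same")
        return estimate

    # illustrative usage with a Gaussian PSF and a shot-noise-corrupted image
    rng = np.random.default_rng(0)
    xx, yy = np.meshgrid(np.arange(-7, 8), np.arange(-7, 8))
    psf = np.exp(-(xx**2 + yy**2) / 8.0); psf /= psf.sum()
    truth = np.zeros((64, 64)); truth[30:34, 30:34] = 100.0
    observed = rng.poisson(fftconvolve(truth, psf, mode="same").clip(min=0)).astype(float)
    restored = richardson_lucy(observed, psf, n_iter=50)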