Forecasting chaotic systems is a cornerstone challenge in many scientific fields, complicated by the exponential amplification of even infinitesimal prediction errors. Modern machine learning approaches often falter due to two opposing pitfalls: over-specializing on a single, well-known chaotic system (e.g., Lorenz-63), which limits generalizability, or indiscriminately mixing vast, unrelated time-series, which prevents the model from learning the nuances of any specific dynamical regime. We propose Curriculum Chaos Forecasting (CCF), a training paradigm that bridges this gap. CCF organizes training data based on fundamental principles of dynamical systems theory, creating a curriculum that progresses from simple, periodic behaviors to highly complex, chaotic dynamics. We quantify complexity using the largest Lyapunov exponent and attractor dimension, two well-established metrics of chaos. By first training a sequence model on predictable systems and gradually introducing more chaotic trajectories, CCF enables the model to build a robust and generalizable representation of dynamical behaviors. We curate a library of over 50 synthetic ODE/PDE systems to build this curriculum. Our experiments show that pre-training with CCF significantly enhances performance on unseen, real-world benchmarks. On datasets including Sunspot numbers, electricity demand, and human ECG signals, CCF extends the valid prediction horizon by up to 40% compared to random-order training and more than doubles it compared to training on real-world data alone. We demonstrate that this benefit is consistent across various neural architectures (GRU, Transformer) and provide extensive ablations to validate the importance of the curriculum's structure.
We present a probabilistic framework for modeling structured spatiotemporal dynamics from sparse observations, focusing on cardiac motion. Our approach integrates neural ordinary differential equations (NODEs), graph neural networks (GNNs), and neural processes into a unified model that captures uncertainty, temporal continuity, and anatomical structure. We represent dynamic systems as spatiotemporal multiplex graphs and model their latent trajectories using a GNN-parameterized vector field. Given the sparse context observations at node and edge levels, the model infers a distribution over latent initial states and control variables, enabling both interpolation and extrapolation of trajectories. We validate the method on three synthetic dynamical systems (coupled pendulum, Lorenz attractor, and Kuramoto oscillators) and two real-world cardiac imaging datasets - ACDC (N=150) and UK Biobank (N=526) - demonstrating accurate reconstruction, extrapolation, and disease classification capabilities. The model accurately reconstructs trajectories and extrapolates future cardiac cycles from a single observed cycle. It achieves state-of-the-art results on the ACDC classification task (up to 99% accuracy), and detects atrial fibrillation in UK Biobank subjects with competitive performance (up to 67% accuracy). This work introduces a flexible approach for analyzing cardiac motion and offers a foundation for graph-based learning in structured biomedical spatiotemporal time-series data.
Deep Operator Networks (DeepONets) have recently emerged as powerful data-driven frameworks for learning nonlinear operators, particularly suited for approximating solutions to partial differential equations (PDEs). Despite their promising capabilities, the standard implementation of DeepONets, which typically employs fully connected linear layers in the trunk network, can encounter limitations in capturing complex spatial structures inherent to various PDEs. To address this, we introduce Fourier-embedded trunk networks within the DeepONet architecture, leveraging random Fourier feature mappings to enrich spatial representation capabilities. Our proposed Fourier-embedded DeepONet, FEDONet demonstrates superior performance compared to the traditional DeepONet across a comprehensive suite of PDE-driven datasets, including the two-dimensional Poisson equation, Burgers' equation, the Lorenz-63 chaotic system, Eikonal equation, Allen-Cahn equation, Kuramoto-Sivashinsky equation, and the Lorenz-96 system. Empirical evaluations of FEDONet consistently show significant improvements in solution reconstruction accuracy, with average relative L2 performance gains ranging between 2-3x compared to the DeepONet baseline. This study highlights the effectiveness of Fourier embeddings in enhancing neural operator learning, offering a robust and broadly applicable methodology for PDE surrogate modeling.
This work addresses the challenge of forecasting urban water dynamics by developing a multi-input, multi-output deep learning model that incorporates both endogenous variables (e.g., water height or discharge) and exogenous factors (e.g., precipitation history and forecast reports). Unlike conventional forecasting, the proposed model, AquaCast, captures both inter-variable and temporal dependencies across all inputs, while focusing forecast solely on endogenous variables. Exogenous inputs are fused via an embedding layer, eliminating the need to forecast them and enabling the model to attend to their short-term influences more effectively. We evaluate our approach on the LausanneCity dataset, which includes measurements from four urban drainage sensors, and demonstrate state-of-the-art performance when using only endogenous variables. Performance also improves with the inclusion of exogenous variables and forecast reports. To assess generalization and scalability, we additionally test the model on three large-scale synthesized datasets, generated from MeteoSwiss records, the Lorenz Attractors model, and the Random Fields model, each representing a different level of temporal complexity across 100 nodes. The results confirm that our model consistently outperforms existing baselines and maintains a robust and accurate forecast across both real and synthetic datasets.
This paper demonstrates the feasibility of trajectory learning for ensemble forecasts by employing the continuous ranked probability score (CRPS) as a loss function. Using the two-scale Lorenz '96 system as a case study, we develop and train both additive and multiplicative stochastic parametrizations to generate ensemble predictions. Results indicate that CRPS-based trajectory learning produces parametrizations that are both accurate and sharp. The resulting parametrizations are straightforward to calibrate and outperform derivative-fitting-based parametrizations in short-term forecasts. This approach is particularly promising for data assimilation applications due to its accuracy over short lead times.
This paper investigates the generalisability of Koopman-based representations for chaotic dynamical systems, focusing on their transferability across prediction and control tasks. Using the Lorenz system as a testbed, we propose a three-stage methodology: learning Koopman embeddings through autoencoding, pre-training a transformer on next-state prediction, and fine-tuning for safety-critical control. Our results show that Koopman embeddings outperform both standard and physics-informed PCA baselines, achieving accurate and data-efficient performance. Notably, fixing the pre-trained transformer weights during fine-tuning leads to no performance degradation, indicating that the learned representations capture reusable dynamical structure rather than task-specific patterns. These findings support the use of Koopman embeddings as a foundation for multi-task learning in physics-informed machine learning. A project page is available at https://kikisprdx.github.io/.




Forecasting chaotic dynamics beyond a few Lyapunov times is difficult because infinitesimal errors grow exponentially. Existing Echo State Networks (ESNs) mitigate this growth but employ reservoirs whose Euclidean geometry is mismatched to the stretch-and-fold structure of chaos. We introduce the Hyperbolic Embedding Reservoir (HypER), an ESN whose neurons are sampled in the Poincare ball and whose connections decay exponentially with hyperbolic distance. This negative-curvature construction embeds an exponential metric directly into the latent space, aligning the reservoir's local expansion-contraction spectrum with the system's Lyapunov directions while preserving standard ESN features such as sparsity, leaky integration, and spectral-radius control. Training is limited to a Tikhonov-regularized readout. On the chaotic Lorenz-63 and Roessler systems, and the hyperchaotic Chen-Ueta attractor, HypER consistently lengthens the mean valid-prediction horizon beyond Euclidean and graph-structured ESN baselines, with statistically significant gains confirmed over 30 independent runs; parallel results on real-world benchmarks, including heart-rate variability from the Santa Fe and MIT-BIH datasets and international sunspot numbers, corroborate its advantage. We further establish a lower bound on the rate of state divergence for HypER, mirroring Lyapunov growth.
State estimation from noisy observations is a fundamental problem in many applications of signal processing. Traditional methods, such as the extended Kalman filter, work well under fully-known Gaussian models, while recent hybrid deep learning frameworks, combining model-based and data-driven approaches, can also handle partially known models and non-Gaussian noise. However, existing studies commonly assume the absence of quantization distortion, which is inevitable, especially with non-ideal analog-to-digital converters. In this work, we consider a state estimation problem with 1-bit quantization. 1-bit quantization causes significant quantization distortion and severe information loss, rendering conventional state estimation strategies unsuitable. To address this, inspired by the Bussgang decomposition technique, we first develop the Bussgang-aided Kalman filter by assuming perfectly known models. The proposed method suitably captures quantization distortion into the state estimation process. In addition, we propose a computationally efficient variant, referred to as the reduced Bussgang-aided Kalman filter and, building upon it, introduce a deep learning-based approach for handling partially known models, termed the Bussgang-aided KalmanNet. In particular, the Bussgang-aided KalmanNet jointly uses a dithering technique and a gated recurrent unit (GRU) architecture to effectively mitigate the effects of 1-bit quantization and model mismatch. Through simulations on the Lorenz-Attractor model and the Michigan NCLT dataset, we demonstrate that our proposed methods achieve accurate state estimation performance even under highly nonlinear, mismatched models and 1-bit observations.




Neural populations exhibit latent dynamical structures that drive time-evolving spiking activities, motivating the search for models that capture both intrinsic network dynamics and external unobserved influences. In this work, we introduce LangevinFlow, a sequential Variational Auto-Encoder where the time evolution of latent variables is governed by the underdamped Langevin equation. Our approach incorporates physical priors -- such as inertia, damping, a learned potential function, and stochastic forces -- to represent both autonomous and non-autonomous processes in neural systems. Crucially, the potential function is parameterized as a network of locally coupled oscillators, biasing the model toward oscillatory and flow-like behaviors observed in biological neural populations. Our model features a recurrent encoder, a one-layer Transformer decoder, and Langevin dynamics in the latent space. Empirically, our method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor, closely matching ground-truth firing rates. On the Neural Latents Benchmark (NLB), the model achieves superior held-out neuron likelihoods (bits per spike) and forward prediction accuracy across four challenging datasets. It also matches or surpasses alternative methods in decoding behavioral metrics such as hand velocity. Overall, this work introduces a flexible, physics-inspired, high-performing framework for modeling complex neural population dynamics and their unobserved influences.
Handling regime shifts and non-stationary time series in deep learning systems presents a significant challenge. In the case of online learning, when new information is introduced, it can disrupt previously stored data and alter the model's overall paradigm, especially with non-stationary data sources. Therefore, it is crucial for neural systems to quickly adapt to new paradigms while preserving essential past knowledge relevant to the overall problem. In this paper, we propose a novel training algorithm for neural networks called \textit{Lyapunov Learning}. This approach leverages the properties of nonlinear chaotic dynamical systems to prepare the model for potential regime shifts. Drawing inspiration from Stuart Kauffman's Adjacent Possible theory, we leverage local unexplored regions of the solution space to enable flexible adaptation. The neural network is designed to operate at the edge of chaos, where the maximum Lyapunov exponent, indicative of a system's sensitivity to small perturbations, evolves around zero over time. Our approach demonstrates effective and significant improvements in experiments involving regime shifts in non-stationary systems. In particular, we train a neural network to deal with an abrupt change in Lorenz's chaotic system parameters. The neural network equipped with Lyapunov learning significantly outperforms the regular training, increasing the loss ratio by about $96\%$.