Abstract: Data assimilation (DA) estimates the state of an evolving dynamical system from noisy, partial observations, and is widely used in scientific simulation as well as in weather and climate science. In practice, filtering methods rely on frame-to-frame transition models. However, these models are fragile when observations are non-Markovian (that is, when they form only a partial slice of a higher-dimensional latent state, as in real-world weather data): they tend to accumulate errors over long horizons. At the same time, learned DA methods typically commit to a single regime, either filtering (nowcasting, real-time forecasting) or smoothing (retrospective reanalysis), which splits what should be a shared prior across application-specific pipelines. To address both issues, we introduce ForcingDAS, a unified and robust DA framework. Built on Diffusion Forcing with an independent noise level assigned to each frame, ForcingDAS learns a joint-trajectory prior instead of frame-to-frame transitions. This allows it to capture long-horizon temporal dependencies and reduce error accumulation. In addition, the same trained model spans the full filtering-to-smoothing spectrum at inference time. Specifically, nowcasting, fixed-lag smoothing, and batch reanalysis are selected through the inference schedule alone, without retraining. We evaluate ForcingDAS on 2D Navier-Stokes vorticity, precipitation nowcasting, and global atmospheric state estimation. Across all settings, a single model is competitive with or outperforms both learned and classical baselines that are specialized for individual regimes, with the largest gains observed on real-world weather benchmarks.
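As a rough illustration of what "an independent noise level assigned to each frame" can look like, the sketch below corrupts a trajectory with a separately drawn noise level per frame. It is not the ForcingDAS implementation: the function name `noise_trajectory`, the log-uniform sampling of noise levels, the additive Gaussian corruption, and the default bounds are all assumptions made for illustration.

```python
import numpy as np

def noise_trajectory(x, rng, sigma_min=0.01, sigma_max=1.0):
    """Corrupt a trajectory x of shape (T, D) with one noise level per frame.

    Assumption: noise levels are drawn log-uniformly in [sigma_min, sigma_max]
    and applied additively; the paper may use a different parameterization.
    """
    num_frames = x.shape[0]
    # One independent noise level per frame, rather than one per trajectory.
    log_sigma = rng.uniform(np.log(sigma_min), np.log(sigma_max), size=num_frames)
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(x.shape)
    # Broadcast each frame's noise level over that frame's state dimensions.
    x_noisy = x + sigma[:, None] * eps
    return x_noisy, sigma, eps

rng = np.random.default_rng(0)
trajectory = np.zeros((5, 3))  # toy trajectory: 5 frames, 3 state variables
x_noisy, sigma, eps = noise_trajectory(trajectory, rng)
```

A denoiser trained on such inputs sees frames at mixed noise levels within one trajectory, which is what would let an inference schedule hold some frames nearly clean while others are still being denoised.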
Abstract: Principal component analysis (PCA) is a key tool for data dimensionality reduction. Various methods, such as K-Subspaces (KSS), have been proposed to extend PCA to the union of subspaces (UoS) setting for clustering data that come from multiple subspaces. However, some applications involve heterogeneous data that vary in quality due to noise characteristics associated with each data sample. Heteroscedastic methods aim to deal with such mixed data quality. This paper develops a heteroscedastic-focused subspace clustering method, named ALPCAHUS, that can estimate the sample-wise noise variances and use this information to improve the estimate of the subspace bases associated with the low-rank structure of the data. This clustering algorithm builds on KSS principles by extending the recently proposed heteroscedastic PCA method, named LR-ALPCAH, to clusters with heteroscedastic noise in the UoS setting. Simulations and real-data experiments show the effectiveness of accounting for data heteroscedasticity compared to existing clustering algorithms. Code is available at https://github.com/javiersc1/ALPCAHUS.
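For orientation, the following is a minimal sketch of the plain K-Subspaces alternation that the abstract says ALPCAHUS builds on. It is the standard homoscedastic KSS, not ALPCAHUS or LR-ALPCAH: it does not estimate sample-wise noise variances. The function name `kss`, its arguments, and the random initialization are illustrative assumptions.

```python
import numpy as np

def kss(Y, num_clusters, dim, num_iters=20, seed=0):
    """Plain K-Subspaces on data Y of shape (D, N), one sample per column.

    Alternates between (1) assigning each sample to the subspace that
    captures the most of its energy and (2) refitting each subspace basis
    by PCA (truncated SVD) on its assigned samples.
    """
    rng = np.random.default_rng(seed)
    ambient_dim, num_samples = Y.shape
    # Random orthonormal initial bases (assumption: other inits are possible).
    bases = [np.linalg.qr(rng.standard_normal((ambient_dim, dim)))[0]
             for _ in range(num_clusters)]
    labels = np.zeros(num_samples, dtype=int)
    for _ in range(num_iters):
        # Assignment step: projection energy ||U_k^T y||^2 for every sample.
        energy = np.stack([np.sum((U.T @ Y) ** 2, axis=0) for U in bases])
        labels = np.argmax(energy, axis=0)
        # Update step: top-`dim` left singular vectors of each cluster.
        for k in range(num_clusters):
            Yk = Y[:, labels == k]
            if Yk.shape[1] >= dim:  # keep the old basis if the cluster is too small
                U, _, _ = np.linalg.svd(Yk, full_matrices=False)
                bases[k] = U[:, :dim]
    return labels, bases

# Toy data: two random 2-dimensional subspaces in R^10 with small noise.
rng = np.random.default_rng(1)
true_bases = [np.linalg.qr(rng.standard_normal((10, 2)))[0] for _ in range(2)]
Y = np.hstack([U @ rng.standard_normal((2, 50)) for U in true_bases])
Y = Y + 0.01 * rng.standard_normal(Y.shape)
labels, bases = kss(Y, num_clusters=2, dim=2)
```

A heteroscedastic variant would replace the unweighted SVD in the update step with an estimator that accounts for per-sample noise variances, which is the role the abstract assigns to LR-ALPCAH.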




Abstract: Purpose: Arterial Spin Labeling (ASL) is a quantitative, non-invasive alternative to perfusion imaging with contrast agents. Traditional ASL fixes the values of certain model parameters that actually vary from region to region, which may introduce bias in perfusion estimates. Adopting Magnetic Resonance Fingerprinting (MRF) for ASL is an alternative in which these parameters are estimated alongside perfusion, but multiparametric estimation can degrade precision. We aim to improve the sensitivity of MRF-ASL signals to the underlying parameters to counter this problem and provide precise estimates. We also propose a regression-based estimation framework for MRF-ASL. Methods: To improve the sensitivity of MRF-ASL signals to the underlying parameters, we optimize ASL labeling durations using the Cramér-Rao Lower Bound (CRLB). This paper also proposes a neural network regression-based estimation framework trained using noisy synthetic signals generated from our ASL signal model. Results: We test our methods in silico and in vivo, and compare with multiple post-labeling delay (multi-PLD) ASL and unoptimized MRF-ASL. We present comparisons of estimated maps for six parameters accounted for in our signal model. Conclusions: The scan design process facilitates precise estimates of multiple hemodynamic parameters and tissue properties from a single scan, in regions of gray and white matter, as well as in regions with anomalous perfusion activity in the brain. The regression-based estimation approach provides perfusion estimates rapidly and bypasses problems with quantization error. Keywords: Arterial Spin Labeling, Magnetic Resonance Fingerprinting, Optimization, Cramér-Rao Bound, Scan Design, Regression, Neural Networks, Deep Learning, Precision, Estimation, Brain Hemodynamics.
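The idea of using the CRLB to compare scan designs can be sketched on a toy problem. The example below does not use the paper's ASL signal model or its optimizer: it assumes a two-parameter mono-exponential signal, i.i.d. Gaussian noise, and a hand-picked pair of candidate sampling designs, and all function names are hypothetical.

```python
import numpy as np

def toy_jacobian(t, amp, tau):
    """Jacobian of the toy signal s(t) = amp * exp(-t / tau) w.r.t. (amp, tau)."""
    decay = np.exp(-t / tau)
    return np.column_stack([decay, amp * t / tau**2 * decay])

def crlb(jacobian, sigma):
    """CRLB covariance under i.i.d. Gaussian noise: sigma^2 * (J^T J)^{-1}."""
    fisher = jacobian.T @ jacobian / sigma**2
    return np.linalg.inv(fisher)

def design_cost(t, amp=1.0, tau=1.5, sigma=0.05):
    """Sum of relative variance bounds; lower means a more informative design."""
    t = np.asarray(t, dtype=float)
    bound = crlb(toy_jacobian(t, amp, tau), sigma)
    theta = np.array([amp, tau])
    return float(np.sum(np.diag(bound) / theta**2))

# Two candidate designs with the same number of samples (illustrative values).
designs = {
    "clustered": [0.1, 0.2, 0.3, 0.4],
    "spread": [0.1, 1.0, 2.0, 4.0],
}
costs = {name: design_cost(t) for name, t in designs.items()}
best = min(costs, key=costs.get)
```

In an actual scan design problem, the sampling times would be replaced by the design variables of interest (labeling durations, in the abstract's setting) and the cost would be minimized with a numerical optimizer rather than by comparing two fixed candidates.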