Abstract:This study presents a conditional flow matching framework for solving physics-constrained Bayesian inverse problems. In this setting, samples from the joint distribution of inferred variables and measurements are assumed available, while explicit evaluation of the prior and likelihood densities is not required. We derive a simple and self-contained formulation of both the unconditional and conditional flow matching algorithms, tailored specifically to inverse problems. In the conditional setting, a neural network is trained to learn the velocity field of a probability flow ordinary differential equation that transports samples from a chosen source distribution directly to the posterior distribution conditioned on observed measurements. This black-box formulation accommodates nonlinear, high-dimensional, and potentially non-differentiable forward models without restrictive assumptions on the noise model. We further analyze the behavior of the learned velocity field in the regime of finite training data. Under mild architectural assumptions, we show that overtraining can induce degenerate behavior in the generated conditional distributions, including variance collapse and a phenomenon termed selective memorization, wherein generated samples concentrate around training data points associated with similar observations. A simplified theoretical analysis explains this behavior, and numerical experiments confirm it in practice. We demonstrate that standard early-stopping criteria based on monitoring test loss effectively mitigate such degeneracy. The proposed method is evaluated on several physics-based inverse problems. We investigate the impact of different choices of source distributions, including Gaussian and data-informed priors. Across these examples, conditional flow matching accurately captures complex, multimodal posterior distributions while maintaining computational efficiency.
Abstract:We propose a data-driven method to learn the time-dependent probability density of a multivariate stochastic process from sample paths, assuming that the initial probability density is known and can be evaluated. Our method uses a novel time-dependent binary classifier trained using a contrastive estimation-based objective that trains the classifier to discriminate between realizations of the stochastic process at two nearby time instants. Significantly, the proposed method explicitly models the time-dependent probability distribution, which means that it is possible to obtain the value of the probability density within the time horizon of interest. Additionally, the input before the final activation in the time-dependent classifier is a second-order approximation to the partial derivative, with respect to time, of the logarithm of the density. We apply the proposed approach to approximate the time-dependent probability density functions for systems driven by stochastic excitations. We also use the proposed approach to synthesize new samples of a random vector from a given set of its realizations. In such applications, we generate sample paths necessary for training using stochastic interpolants. Subsequently, new samples are generated using gradient-based Markov chain Monte Carlo methods because automatic differentiation can efficiently provide the necessary gradient. Further, we demonstrate the utility of an explicit approximation to the time-dependent probability density function through applications in unsupervised outlier detection. Through several numerical experiments, we show that the proposed method accurately reconstructs complex time-dependent, multi-modal, and near-degenerate densities, scales effectively to moderately high-dimensional problems, and reliably detects rare events among real-world data.
Abstract:Diffusion models have emerged as powerful generative tools with applications in computer vision and scientific machine learning (SciML), where they have been used to solve large-scale probabilistic inverse problems. Traditionally, these models have been derived using principles of variational inference, denoising, statistical signal processing, and stochastic differential equations. In contrast to the conventional presentation, in this study we derive diffusion models using ideas from linear partial differential equations and demonstrate that this approach has several benefits that include a constructive derivation of the forward and reverse processes, a unified derivation of multiple formulations and sampling strategies, and the discovery of a new class of models. We also apply the conditional version of these models to solving canonical conditional density estimation problems and challenging inverse problems. These problems help establish benchmarks for systematically quantifying the performance of different formulations and sampling strategies in this study, and for future studies. Finally, we identify and implement a mechanism through which a single diffusion model can be applied to measurements obtained from multiple measurement operators. Taken together, the contents of this manuscript provide a new understanding and several new directions in the application of diffusion models to solving physics-based inverse problems.