Abstract: In this paper, we propose a flow-based method for learning all-to-all transfer maps among conditional distributions, approximating pairwise optimal transport. The proposed method addresses the challenge of handling continuous conditions, which often involve a large set of conditions with only sparse empirical observations per condition. We introduce a novel cost function that enables simultaneous learning of optimal transports for all pairs of conditional distributions. Our method is supported by a theoretical guarantee that, in the limit, it converges to the pairwise optimal transports among infinitely many pairs of conditional distributions. The learned transport maps are subsequently used to couple data points in conditional flow matching. We demonstrate the effectiveness of this method on synthetic and benchmark datasets, as well as on chemical datasets where continuous physical properties are defined as conditions.
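As a rough illustration of what "coupling data points in conditional flow matching" can mean, the sketch below pairs two tiny point clouds by exact optimal transport under a plain squared Euclidean cost and then forms the usual linear interpolant and target velocity. This is a generic OT-coupled flow matching sketch, not the cost function proposed in the paper, which the abstract does not specify; the function names (`ot_coupling`, `fm_target`) and the toy data are hypothetical, and the brute-force search over permutations is only practical for very small batches.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def ot_coupling(xs0, xs1):
    """Exact OT pairing between two equal-size, uniformly weighted point clouds.

    Returns perm such that xs0[i] is paired with xs1[perm[i]].
    Brute force over all permutations: an assumption made for clarity,
    feasible only for tiny batches.
    """
    n = len(xs0)
    assert n == len(xs1), "this sketch assumes equal batch sizes"
    best = min(
        permutations(range(n)),
        key=lambda p: sum(sq_dist(xs0[i], xs1[p[i]]) for i in range(n)),
    )
    return list(best)


def fm_target(x0, x1, t):
    """Linear interpolant x_t = (1 - t) x0 + t x1 and its velocity x1 - x0."""
    x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    v = [b - a for a, b in zip(x0, x1)]
    return x_t, v


# Toy data (hypothetical): three source and three target points in 2D.
xs0 = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
xs1 = [[2.1, 0.1], [0.1, -0.1], [0.9, 1.2]]

perm = ot_coupling(xs0, xs1)
x_t, v = fm_target(xs0[0], xs1[perm[0]], t=0.5)
```

In a training loop, each coupled pair would supply one regression target `v` at the point `x_t` for the velocity network; how the condition enters the cost is exactly the part this sketch leaves out.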
Abstract: We discuss a connection between a class of generative models, called diffusion models, and nonequilibrium thermodynamics for the Fokker-Planck equation, called stochastic thermodynamics. Based on the techniques of stochastic thermodynamics, we derive the speed-accuracy trade-off for diffusion models, a trade-off relationship between the speed and the accuracy of data generation. Our result implies that the entropy production rate in the forward process affects the errors in data generation. From a stochastic thermodynamic perspective, our results provide quantitative insight into how best to generate data in diffusion models. The optimal learning protocol is introduced via the conservative force in stochastic thermodynamics and the geodesic of the space equipped with the 2-Wasserstein distance in optimal transport theory. We numerically illustrate the validity of the speed-accuracy trade-off for diffusion models with different noise schedules, such as the cosine schedule, the conditional optimal transport, and the optimal transport.
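To make two of the noise schedules named above concrete, the sketch below writes each as a pair of coefficients (signal, noise) applied to a data point and a Gaussian sample at normalized time t in [0, 1]. The cosine schedule follows the commonly used form with a small offset `s = 0.008`; the conditional optimal transport schedule is taken to be the linear interpolation between data and noise. Both parameterizations are assumptions about the standard definitions, not formulas taken from this paper, and the function names are hypothetical.

```python
import math


def cosine_alpha_bar(t, s=0.008):
    """Cumulative signal level of the cosine schedule at time t in [0, 1]."""
    def f(u):
        return math.cos((u + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0.0)


def cosine_coeffs(t):
    """Variance-preserving form: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    a = cosine_alpha_bar(t)
    return math.sqrt(a), math.sqrt(max(0.0, 1.0 - a))


def cond_ot_coeffs(t):
    """Linear interpolation (assumed form): x_t = (1 - t) * x_0 + t * eps."""
    return 1.0 - t, t


# Evaluate both schedules on a coarse time grid.
grid = [i / 10 for i in range(11)]
cosine_levels = [cosine_alpha_bar(t) for t in grid]
linear_levels = [cond_ot_coeffs(t)[0] for t in grid]
```

Both schedules start at pure data (signal coefficient 1 at t = 0) and end at essentially pure noise at t = 1; they differ in how quickly the signal is removed in between, which is the kind of difference a speed-accuracy comparison would probe.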