Abstract: Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal-Gradient (RWCP) for composite objectives. By exploiting coordinate-wise structure, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classical convergence rates in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer an analytical template that extends to other optimization solvers on the space of measures. Numerical experiments on ill-conditioned energies demonstrate significant speedups over conventional full-gradient methods.
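As a minimal sketch of the coordinate-descent idea (assuming the simplest setting of a potential energy E(mu) = \int V d(mu), whose Wasserstein gradient acts on particles through grad V; the quadratic V, the step sizes, and all names below are illustrative choices, not the paper's implementation):

    import numpy as np

    rng = np.random.default_rng(0)

    # Anisotropic quadratic potential V(x) = 0.5 * sum_i a_i * x_i^2;
    # the energy E(mu) = \int V d(mu) has Wasserstein gradient grad V,
    # so a gradient step moves every particle by -eta * grad V.
    a = np.array([1.0, 100.0])        # condition number 100: ill-conditioned

    X = rng.normal(size=(500, 2))     # particle approximation of mu

    def rwcd_step(X, step_sizes):
        """One randomized coordinate step: pick a coordinate i uniformly and
        move all particles along the i-th component of the Wasserstein
        gradient only, with a step size adapted to that coordinate."""
        i = rng.integers(X.shape[1])
        X = X.copy()
        X[:, i] -= step_sizes[i] * a[i] * X[:, i]   # grad_i V(x) = a_i * x_i
        return X

    # Per-coordinate step sizes 1/a_i (exact minimization along a coordinate
    # for a quadratic); a full-gradient step would instead be throttled by
    # the worst-conditioned direction, eta <= 1/max(a).
    eta = 1.0 / a
    for _ in range(200):
        X = rwcd_step(X, eta)

    print("mean squared norm of particles:", np.mean(np.sum(X**2, axis=1)))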



Abstract: In this note, we examine the forward-Euler discretization for simulating Wasserstein gradient flows. We provide two counterexamples demonstrating the failure of this discretization even in a simple setting where the energy functional is the KL divergence against well-structured probability densities. We also give a simple explanation of this failure.
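One way to see the instability (an illustrative one-dimensional Gaussian computation; the note's counterexamples may differ) is as follows. The forward-Euler step for $F(\mu) = \mathrm{KL}(\mu \,\|\, \pi)$ pushes $\mu_k$ forward through the map

\[
  T_\eta(x) \;=\; x - \eta\,\nabla\bigl(\log \mu_k - \log \pi\bigr)(x),
  \qquad \mu_{k+1} = (T_\eta)_\# \mu_k .
\]

Taking $\pi = \mathcal{N}(0,1)$ and $\mu_k = \mathcal{N}(0,\sigma_k^2)$ gives $\nabla \log \mu_k(x) = -x/\sigma_k^2$ and $\nabla \log \pi(x) = -x$, so $T_\eta$ is linear and $\mu_{k+1}$ stays Gaussian with

\[
  \sigma_{k+1}
  \;=\; \bigl|\,1 - \eta\,(1 - \sigma_k^{-2})\,\bigr|\,\sigma_k
  \;=\; \bigl|\,(1-\eta)\,\sigma_k + \eta/\sigma_k\,\bigr| .
\]

For any fixed $\eta > 0$, $\sigma_{k+1} \approx \eta/\sigma_k$ blows up as $\sigma_k \to 0$: the Wasserstein gradient $\nabla \log(\mu/\pi)$ grows without bound as the measure concentrates, so the explicit step overshoots and no uniform step size is stable.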
Abstract: Does auto-differentiation yield reasonable updates to deep neural networks that represent neural ODEs? Through mathematical analysis and numerical evidence, we find that when the neural network employs high-order schemes to approximate the underlying ODE flow (such as linear multistep methods (LMMs)), brute-force computation using auto-differentiation often produces non-converging artificial oscillations. For the Leapfrog scheme, we propose a straightforward post-processing technique that effectively eliminates these oscillations, rectifies the gradient computation, and thus respects the updates of the underlying flow.
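To make the oscillation concrete, here is a self-contained sketch on the scalar flow x' = theta * x (an illustrative setup; the two-point averaging filter below is one natural way to damp the parasitic mode and is not necessarily the note's exact post-processing). Differentiating through each unrolled Leapfrog step by hand reproduces what auto-differentiation would accumulate, and the scheme's parasitic root near -1 makes the sensitivities flip sign from step to step:

    import numpy as np

    def leapfrog_sensitivities(theta, x0=1.0, h=0.05, n_steps=200):
        """Integrate x' = theta*x with Leapfrog, x_{n+1} = x_{n-1} + 2h*theta*x_n,
        accumulating g_n = d x_n / d theta exactly as auto-differentiation
        through the unrolled scheme would."""
        xs = [x0, x0 + h * theta * x0]   # bootstrap the two-step scheme with Euler
        gs = [0.0, h * x0]
        for n in range(1, n_steps):
            xs.append(xs[n - 1] + 2.0 * h * theta * xs[n])
            gs.append(gs[n - 1] + 2.0 * h * (xs[n] + theta * gs[n]))
        return np.array(xs), np.array(gs)

    xs, gs = leapfrog_sensitivities(theta=-2.0)  # decaying flow: parasitic mode grows

    # Raw sensitivities alternate in sign and diverge (the (-1)^n parasitic mode) ...
    print("raw:     ", gs[-4:])
    # ... while averaging adjacent steps damps that mode by a factor of O(h).
    print("smoothed:", (0.5 * (gs[1:] + gs[:-1]))[-4:])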