Felix X. -F. Ye

Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces

Jun 09, 2025

Kevin Rojas, Yuchen Zhu, Sichen Zhu, Felix X. -F. Ye, Molei Tao

Abstract: Diffusion models have demonstrated remarkable performance in generating unimodal data across various tasks, including image, video, and text generation. In contrast, the joint generation of multimodal data through diffusion models is still in the early stages of exploration. Existing approaches rely heavily on external preprocessing protocols, such as tokenizers and variational autoencoders, to harmonize varied data representations into a unified, unimodal format. This process demands highly accurate encoders and decoders, which can be problematic for applications with limited data. To lift this restriction, we propose a novel framework for building multimodal diffusion models on arbitrary state spaces, enabling native generation of coupled data across different modalities. By introducing an innovative decoupled noise schedule for each modality, we enable both unconditional and modality-conditioned generation within a single model simultaneously. We empirically validate our approach for text-image generation and mixed-type tabular data synthesis, demonstrating that it achieves competitive performance.

* Accepted to ICML 2025. Code available at https://github.com/KevinRojas1499/Diffuse-Everything
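
The idea of giving each modality its own noise level can be illustrated with a minimal sketch. The abstract does not specify the corruption processes, so everything here is an assumption made for illustration: Gaussian noising for the continuous modality, random masking for the discrete one, and the hypothetical names `noise_pair` and `MASK_ID`. It is not the authors' implementation. Setting one modality's noise level to zero keeps it clean, which is one way a single model could see both joint and conditional cases.

```python
import numpy as np

MASK_ID = -1  # hypothetical mask token id (assumption, not from the paper)

def noise_pair(image, tokens, t_image, t_text, rng):
    """Corrupt each modality with its own noise level in [0, 1]."""
    # Continuous modality: variance-preserving Gaussian interpolation (assumed).
    eps = rng.standard_normal(image.shape)
    noisy_image = np.sqrt(1.0 - t_image) * image + np.sqrt(t_image) * eps
    # Discrete modality: mask each token independently with probability t_text (assumed).
    masked = rng.random(tokens.shape) < t_text
    noisy_tokens = np.where(masked, MASK_ID, tokens)
    return noisy_image, noisy_tokens

rng = np.random.default_rng(0)
image = rng.standard_normal((4, 4))
tokens = np.array([5, 2, 9, 7])

# Joint case: independent noise levels drawn per modality.
t_image, t_text = rng.random(2)
both_noisy = noise_pair(image, tokens, t_image, t_text, rng)

# Text-conditioned case: text stays clean (t_text = 0) while the image is noised.
noisy_image, clean_tokens = noise_pair(image, tokens, 0.8, 0.0, rng)
```

With `t_text = 0.0` no token is masked, so the text acts as a clean condition for the noised image; swapping the roles gives the image-conditioned case.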

Nonlinear model reduction for slow-fast stochastic systems near manifolds

Apr 05, 2021

Felix X. -F. Ye, Sichen Yang, Mauro Maggioni

Abstract: We introduce a nonlinear stochastic model reduction technique for high-dimensional stochastic dynamical systems that have a low-dimensional invariant effective manifold with slow dynamics, and high-dimensional, large fast modes. Given only access to a black-box simulator from which short bursts of simulation can be obtained, we estimate the invariant manifold and a process of the effective (stochastic) dynamics on it, and construct an efficient simulator thereof. These estimation steps can be performed on the fly, leading to efficient exploration of the effective state space without losing consistency with the underlying dynamics. This construction enables fast and efficient simulation of paths of the effective dynamics, together with estimation of crucial features and observables of such dynamics, including the stationary distribution, identification of metastable states, and residence times and transition rates between them.

* 45 pages, 9 figures, 5 tables

ISALT: Inference-based schemes adaptive to large time-stepping for locally Lipschitz ergodic systems

Feb 25, 2021

Xingjie Li, Fei Lu, Felix X. -F. Ye

Abstract: Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. However, locally Lipschitz SDEs often require special treatment, such as implicit schemes with small time-steps, to accurately simulate the ergodic measure. We introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in time by several orders of magnitude. The key is the statistical learning of an approximation to the infinite-dimensional discrete-time flow map. We explore the use of numerical schemes (such as the Euler-Maruyama, a hybrid RK4, and an implicit scheme) to derive informed basis functions, leading to a parameter inference problem. We introduce a scalable algorithm to estimate the parameters by least squares, and we prove the convergence of the estimators as data size increases. We test ISALT on three non-globally Lipschitz SDEs: the 1D double-well potential, a 2D multi-scale gradient system, and the 3D stochastic Lorenz equation with degenerate noise. Numerical results show that ISALT can tolerate time-steps that are magnitudes larger than those of plain numerical schemes. It reaches optimal accuracy in reproducing the invariant measure when the time-step is medium-large.

* 20 pages, 9 figures
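
The parameter inference step described in the abstract above can be sketched as a small least-squares problem. This is a minimal sketch under stated assumptions, not the paper's algorithm: the drift `x - x**3` is one common form of a 1D double-well system, the basis is the two Euler-Maruyama-style terms `delta * x` and `delta * x**3`, and the helper names `simulate_double_well` and `fit_coefficients` are hypothetical. Fine-step data are generated, subsampled at a large step `delta`, and the coefficients of the coarse one-step map are fitted.

```python
import numpy as np

def simulate_double_well(n_steps, dt, sigma, rng, x0=0.0):
    """Fine-step Euler-Maruyama path of dX = (X - X^3) dt + sigma dW (assumed model)."""
    noise = rng.standard_normal(n_steps) * sigma * np.sqrt(dt)
    path = np.empty(n_steps + 1)
    path[0] = x0
    for k in range(n_steps):
        x = path[k]
        path[k + 1] = x + (x - x**3) * dt + noise[k]
    return path

def fit_coefficients(x_now, x_next, delta):
    """Least-squares fit of x_next - x_now ~ delta * (c0 * x + c1 * x^3)."""
    design = delta * np.column_stack([x_now, x_now**3])
    coeffs, *_ = np.linalg.lstsq(design, x_next - x_now, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
fine = simulate_double_well(n_steps=50_000, dt=1e-3, sigma=1.0, rng=rng)

# Keep every 100th point, so the coarse time-step is delta = 100 * dt = 0.1.
coarse = fine[::100]
coeffs = fit_coefficients(coarse[:-1], coarse[1:], delta=0.1)
print("fitted coefficients for [x, x^3]:", coeffs)
```

The fitted coefficients generally differ from the fine-scale values (1, -1), because they absorb the effect of stepping with a large `delta`; with a short trajectory such as this one they are also noisy.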

Estimate exponential memory decay in Hidden Markov Model and its applications

Oct 17, 2017

Felix X. -F. Ye, Yi-an Ma, Hong Qian

Abstract: Inference in hidden Markov models has been challenging to scale because of dependencies in the observation data. In this paper, we utilize the inherent memory decay in hidden Markov models, such that the forward and backward probabilities can be computed on subsequences, enabling efficient inference over long sequences of observations. We formulate this forward filtering process in the setting of random dynamical systems, where Lyapunov exponents exist for the product of i.i.d. random matrices. The rate of memory decay is given almost surely by $\lambda_2-\lambda_1$, the gap between the top two Lyapunov exponents. An efficient and accurate algorithm is proposed to numerically estimate the gap after the soft-max parametrization. The length of subsequences $B$ given the controlled error $\epsilon$ is $B=\log(\epsilon)/(\lambda_2-\lambda_1)$. We theoretically prove the validity of the algorithm and demonstrate its effectiveness with numerical examples. The method developed here can be applied to widely used algorithms, such as mini-batch stochastic gradient methods. Moreover, the continuity of the Lyapunov spectrum ensures that the estimated $B$ can be reused for nearby parameters during inference.
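
The formula $B=\log(\epsilon)/(\lambda_2-\lambda_1)$ can be made concrete with a short sketch. The estimator below is the standard QR re-orthonormalization method for Lyapunov exponents of a matrix product, used here as a stand-in; it is not the paper's algorithm, which works after a soft-max parametrization. The names `lyapunov_top_two`, `subsequence_length`, and `sample_positive_matrix` are hypothetical, and the random positive matrices only stand in for the forward-filter matrices of an actual model.

```python
import math
import numpy as np

def lyapunov_top_two(sample_matrix, dim, n_steps, rng):
    """Estimate the two largest Lyapunov exponents of a random matrix product (QR method)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
    log_sums = np.zeros(2)
    for _ in range(n_steps):
        # Push the orthonormal frame through one random matrix, then re-orthonormalize.
        q, r = np.linalg.qr(sample_matrix(rng) @ q)
        log_sums += np.log(np.abs(np.diag(r)))
    return log_sums / n_steps  # [lambda_1, lambda_2], largest first

def subsequence_length(eps, gap):
    """B = log(eps) / (lambda_2 - lambda_1), rounded up; gap must be negative."""
    return math.ceil(math.log(eps) / gap)

def sample_positive_matrix(rng):
    # Hypothetical stand-in for a forward-filter matrix with positive entries.
    return rng.random((3, 3)) + 0.1

rng = np.random.default_rng(0)
lam1, lam2 = lyapunov_top_two(sample_positive_matrix, dim=3, n_steps=5000, rng=rng)
gap = lam2 - lam1
print("estimated gap:", gap)
print("subsequence length for eps=1e-3:", subsequence_length(1e-3, gap))
```

Because the gap $\lambda_2-\lambda_1$ is negative and $\log(\epsilon)$ is negative for $\epsilon<1$, the resulting $B$ is positive; a larger gap in magnitude means faster memory decay and shorter subsequences.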
