Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Amadou S. Sangare

Improving Controllable Generation: Faster Training and Better Performance via $x_0$-Supervision

Apr 07, 2026

Amadou S. Sangare, Adrien Maglo, Mohamed Chaouch, Bertrand Luvison

Abstract:Text-to-Image (T2I) diffusion/flow models have recently achieved remarkable progress in visual fidelity and text alignment. However, they remain limited when users need to precisely control image layouts, something that natural language alone cannot reliably express. Controllable generation methods augment the initial T2I model with additional conditions that more easily describe the scene. Prior works straightforwardly train the augmented network with the same loss as the initial network. Although natural at first glance, this can lead to very long training times in some cases before convergence. In this work, we revisit the training objective of controllable diffusion models through a detailed analysis of their denoising dynamics. We show that direct supervision on the clean target image, dubbed $x_0$-supervision, or an equivalent re-weighting of the diffusion loss, yields faster convergence. Experiments on multiple control settings demonstrate that our formulation accelerates convergence by up to 2$\times$ according to our novel metric (mean Area Under the Convergence Curve - mAUCC), while also improving both visual quality and conditioning accuracy. Our code is available at https://github.com/CEA-LIST/x0-supervision

Via

Access Paper or Ask Questions

A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning

Feb 28, 2025

Amadou S. Sangare, Nicolas Dunou, Jhony H. Giraldo, Fragkiskos D. Malliaros

Figure 1 for A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning

Figure 2 for A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning

Figure 3 for A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning

Figure 4 for A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning

Abstract:Self-supervised learning has become a key method for training deep learning models when labeled data is scarce or unavailable. While graph machine learning holds great promise across various domains, the design of effective pretext tasks for self-supervised graph representation learning remains challenging. Contrastive learning, a popular approach in graph self-supervised learning, leverages positive and negative pairs to compute a contrastive loss function. However, current graph contrastive learning methods often struggle to fully use structural patterns and node similarities. To address these issues, we present a new method called Fused Gromov Wasserstein Subgraph Contrastive Learning (FOSSIL). Our model integrates node-level and subgraph-level contrastive learning, seamlessly combining a standard node-level contrastive loss with the Fused Gromov-Wasserstein distance. This combination helps our method capture both node features and graph structure together. Importantly, our approach works well with both homophilic and heterophilic graphs and can dynamically create views for generating positive and negative pairs. Through extensive experiments on benchmark graph datasets, we show that FOSSIL outperforms or achieves competitive performance compared to current state-of-the-art methods.

* Transactions on Machine Learning Research, 2025

Via

Access Paper or Ask Questions