Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gonzalo Iñaki Quintana

MONET: A Massive, Open, Non-redundant and Enriched Text-to-image dataset

May 20, 2026

Benjamin Aubin, Gonzalo Iñaki Quintana, Onur Tasar, Sanjeev Sreetharan, Urszula Czerwinska, Damien Henry, Clément Chadebec

Abstract:Training large text-to-image models requires high-quality, curated datasets with diverse content and detailed captions. Yet the cost and complexity of collecting, filtering, deduplicating, and re-captioning such corpora at scale hinders open and reproducible research in the field. We introduce MONET, an open Apache 2.0 dataset of approx. 104.9M image--text pairs collected from 2.9B raw pairs across heterogeneous open sources through successive stages of safety filtering, domain-based filtering, exact and near-duplicate removal, and re-captioning with multiple vision-language models covering short to long-form descriptions, and further augmented with synthetically generated samples. Each image is shipped with pre-computed embeddings and annotations to accelerate downstream use. To validate the effectiveness of MONET, we train a 4B-parameter latent diffusion model exclusively on it and reach competitive GenEval and DPG scores, demonstrating that our dataset lowers the barrier to large-scale, reproducible text-to-image research.

Via

Access Paper or Ask Questions

Evaluating Federated Learning approaches for mammography under breast density heterogeneity

May 09, 2026

Gonzalo Iñaki Quintana, Franco Martin Di Maria, Laurence Vancamberg

Abstract:Breast density is a key factor that influences mammography interpretation and is a major source of heterogeneity in multicenter datasets. Such heterogeneity poses challenges for collaborative machine learning across institutions, particularly in Federated Learning. This study aims to evaluate the impact of breast density-induced heterogeneity on FL for mammography image classification and to assess the robustness of common FL algorithms in realistic clinical settings. We conducted experiments under two scenarios: (1) a strongly heterogeneous setting where each participating site contributed exclusively low- or high-density cases, based on the BI-RADS density score, and (2) a population-based setting simulating breast density distributions in White and Asian populations. For the strongly heterogeneous setting, we evaluated two configurations: one with 2 clients, where the cases were grouped as BI-RADS A-B and C-D, and one with 4 clients, where each site contained cases of a single BI-RADS density. We compared three FL methods (FedAvg, FedProx, SCAFFOLD) against centralized training, local-only training, and naive aggregation approaches, including ensembling and weight averaging. Across both scenarios, FL achieved performance comparable to centralized training, while local models and naive aggregation approaches underperformed in the presence of strong heterogeneity. Notably, FedAvg achieved accuracy on par with or exceeding centralized training, demonstrating resilience to breast density-induced data imbalance without requiring specialized heterogeneity mitigation algorithms. These findings show that FL can address breast density-related heterogeneity, supporting its feasibility for real-world mammography workflows. The demonstrated robustness of FedAvg underscores the potential for broad clinical deployment of FL, enabling collaborative model development while maintaining data privacy.

Via

Access Paper or Ask Questions

Fast, faithful and photorealistic diffusion-based image super-resolution with enhanced Flow Map models

Jan 23, 2026

Maxence Noble, Gonzalo Iñaki Quintana, Benjamin Aubin, Clément Chadebec

Abstract:Diffusion-based image super-resolution (SR) has recently attracted significant attention by leveraging the expressive power of large pre-trained text-to-image diffusion models (DMs). A central practical challenge is resolving the trade-off between reconstruction faithfulness and photorealism. To address inference efficiency, many recent works have explored knowledge distillation strategies specifically tailored to SR, enabling one-step diffusion-based approaches. However, these teacher-student formulations are inherently constrained by information compression, which can degrade perceptual cues such as lifelike textures and depth of field, even with high overall perceptual quality. In parallel, self-distillation DMs, known as Flow Map models, have emerged as a promising alternative for image generation tasks, enabling fast inference while preserving the expressivity and training stability of standard DMs. Building on these developments, we propose FlowMapSR, a novel diffusion-based framework for image super-resolution explicitly designed for efficient inference. Beyond adapting Flow Map models to SR, we introduce two complementary enhancements: (i) positive-negative prompting guidance, based on a generalization of classifier free-guidance paradigm to Flow Map models, and (ii) adversarial fine-tuning using Low-Rank Adaptation (LoRA). Among the considered Flow Map formulations (Eulerian, Lagrangian, and Shortcut), we find that the Shortcut variant consistently achieves the best performance when combined with these enhancements. Extensive experiments show that FlowMapSR achieves a better balance between reconstruction faithfulness and photorealism than recent state-of-the-art methods for both x4 and x8 upscaling, while maintaining competitive inference time. Notably, a single model is used for both upscaling factors, without any scale-specific conditioning or degradation-guided mechanisms.

* Technical report

Via

Access Paper or Ask Questions

BN-SCAFFOLD: controlling the drift of Batch Normalization statistics in Federated Learning

Oct 04, 2024

Gonzalo Iñaki Quintana, Laurence Vancamberg, Vincent Jugnon, Mathilde Mougeot, Agnès Desolneux

Figure 1 for BN-SCAFFOLD: controlling the drift of Batch Normalization statistics in Federated Learning

Figure 2 for BN-SCAFFOLD: controlling the drift of Batch Normalization statistics in Federated Learning

Figure 3 for BN-SCAFFOLD: controlling the drift of Batch Normalization statistics in Federated Learning

Figure 4 for BN-SCAFFOLD: controlling the drift of Batch Normalization statistics in Federated Learning

Abstract:Federated Learning (FL) is gaining traction as a learning paradigm for training Machine Learning (ML) models in a decentralized way. Batch Normalization (BN) is ubiquitous in Deep Neural Networks (DNN), as it improves convergence and generalization. However, BN has been reported to hinder performance of DNNs in heterogeneous FL. Recently, the FedTAN algorithm has been proposed to mitigate the effect of heterogeneity on BN, by aggregating BN statistics and gradients from all the clients. However, it has a high communication cost, that increases linearly with the depth of the DNN. SCAFFOLD is a variance reduction algorithm, that estimates and corrects the client drift in a communication-efficient manner. Despite its promising results in heterogeneous FL settings, it has been reported to underperform for models with BN. In this work, we seek to revive SCAFFOLD, and more generally variance reduction, as an efficient way of training DNN with BN in heterogeneous FL. We introduce a unified theoretical framework for analyzing the convergence of variance reduction algorithms in the BN-DNN setting, inspired of by the work of Wang et al. 2023, and show that SCAFFOLD is unable to remove the bias introduced by BN. We thus propose the BN-SCAFFOLD algorithm, which extends the client drift correction of SCAFFOLD to BN statistics. We prove convergence using the aforementioned framework and validate the theoretical results with experiments on MNIST and CIFAR-10. BN-SCAFFOLD equals the performance of FedTAN, without its high communication cost, outperforming Federated Averaging (FedAvg), SCAFFOLD, and other FL algorithms designed to mitigate BN heterogeneity.

Via

Access Paper or Ask Questions