Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Suhyun Ahn

Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge

Apr 13, 2026

Asbjørn Munk, Stefano Cerri, Vardan Nersesjan, Christian Hedeager Krag, Jakob Ambsdorf, Pablo Rocamora García, Julia Machnio, Peirong Liu, Suhyun Ahn, Nasrin Akbari(+74 more)

Abstract:Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust \textit{foundation models} that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and considered both models trained on FOMO60K (method track) and any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained \textit{out-of-domain} surpassing supervised baselines trained \textit{in-domain}. (b) No single pretraining objective benefits all tasks: MAE favors segmentation, hybrid reconstruction-contrastive objectives favor classification, and (c) strong performance was achieved by small pretrained models, and improvements from scaling model size and training duration did not yield reliable benefits.

Via

Access Paper or Ask Questions

Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images

Oct 29, 2024

Suhyun Ahn, Wonjung Park, Jihoon Cho, Seunghyuck Park, Jinah Park

Figure 1 for Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images

Figure 2 for Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images

Figure 3 for Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images

Figure 4 for Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images

Abstract:Spatial control methods using additional modules on pretrained diffusion models have gained attention for enabling conditional generation in natural images. These methods guide the generation process with new conditions while leveraging the capabilities of large models. They could be beneficial as training strategies in the context of 3D medical imaging, where training a diffusion model from scratch is challenging due to high computational costs and data scarcity. However, the potential application of spatial control methods with additional modules to 3D medical images has not yet been explored. In this paper, we present a tailored spatial control method for 3D medical images with a novel lightweight module, Volumetric Conditioning Module (VCM). Our VCM employs an asymmetric U-Net architecture to effectively encode complex information from various levels of 3D conditions, providing detailed guidance in image synthesis. To examine the applicability of spatial control methods and the effectiveness of VCM for 3D medical data, we conduct experiments under single- and multimodal conditions scenarios across a wide range of dataset sizes, from extremely small datasets with 10 samples to large datasets with 500 samples. The experimental results show that the VCM is effective for conditional generation and efficient in terms of requiring less training data and computational resources. We further investigate the potential applications for our spatial control method through axial super-resolution for medical images. Our code is available at \url{https://github.com/Ahn-Ssu/VCM}

* 17 pages, 18 figures, accepted @ WACV 2025

Via

Access Paper or Ask Questions

Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

Jul 17, 2024

Jihoon Cho, Suhyun Ahn, Beomju Kim, Hyungjoon Bae, Xiaofeng Liu, Fangxu Xing, Kyungeun Lee, Georges Elfakhri, Van Wedeen, Jonghye Woo(+1 more)

Figure 1 for Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

Figure 2 for Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

Figure 3 for Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

Figure 4 for Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views

Abstract:Deep learning-based segmentation techniques have shown remarkable performance in brain segmentation, yet their success hinges on the availability of extensive labeled training data. Acquiring such vast datasets, however, poses a significant challenge in many clinical applications. To address this issue, in this work, we propose a novel 3D brain segmentation approach using complementary 2D diffusion models. The core idea behind our approach is to first mine 2D features with semantic information extracted from the 2D diffusion models by taking orthogonal views as input, followed by fusing them into a 3D contextual feature representation. Then, we use these aggregated features to train multi-layer perceptrons to classify the segmentation labels. Our goal is to achieve reliable segmentation quality without requiring complete labels for each individual subject. Our experiments on training in brain subcortical structure segmentation with a dataset from only one subject demonstrate that our approach outperforms state-of-the-art self-supervised learning methods. Further experiments on the minimum requirement of annotation by sparse labeling yield promising results even with only nine slices and a labeled background region.

* Extended version of "3D Segmentation of Subcortical Brain Structure with Few Labeled Data using 2D Diffusion Models" (ISMRM 2024 oral)

Via

Access Paper or Ask Questions