Abstract: Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust \textit{foundation models} that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and evaluation considered both models trained only on FOMO60K (method track) and models trained on any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained \textit{out-of-domain} surpassing supervised baselines trained \textit{in-domain}; (b) no single pretraining objective benefits all tasks: MAE favors segmentation, while hybrid reconstruction-contrastive objectives favor classification; and (c) strong performance was achieved by small pretrained models, with scaling of model size and training duration yielding no reliable benefits.
Abstract: The field of computer vision is undergoing a paradigm shift toward large-scale foundation model pre-training via self-supervised learning (SSL). Leveraging large volumes of unlabeled brain MRI data, such models can learn anatomical priors that improve few-shot performance on diverse neuroimaging tasks. However, most SSL frameworks are tailored to natural images, and their adaptation to capture multi-modal MRI information remains underexplored. This work proposes a modality-invariant representation learning setup and, following large-scale pre-training, evaluates its effectiveness on stroke and epilepsy lesion segmentation. Experimental results suggest that despite successful cross-modality alignment, lesion segmentation primarily benefits from preserving fine-grained modality-specific features. Model checkpoints and code are made publicly available.
Abstract: Accurate prediction of epileptic seizures could prove critical for improving patient safety and quality of life in drug-resistant epilepsy. Although deep learning-based approaches have shown promising seizure prediction performance using scalp electroencephalogram (EEG) signals, substantial limitations still impede their clinical adoption. Furthermore, identifying the optimal preictal period (OPP) for labeling EEG segments remains a challenge. Here, we not only develop a competitive deep learning model for seizure prediction but, more importantly, leverage it to demonstrate a methodology for comprehensively evaluating predictive performance in the seizure prediction task. To this end, we introduce a CNN-Transformer deep learning model to detect preictal spatiotemporal dynamics, alongside a novel Continuous Input-Output Performance Ratio (CIOPR) metric to determine the OPP. We trained and evaluated our model on 19 pediatric patients from the open-access CHB-MIT dataset in a subject-specific manner. Using the OPP of each patient, preictal and interictal segments were correctly identified with an average sensitivity of 99.31%, specificity of 95.34%, AUC of 99.35%, and F1-score of 97.46%, while prediction time averaged 76.8 minutes before onset. Notably, our novel CIOPR metric enabled a comprehensive, quantitative characterization of how different preictal period definitions affect prediction time, accuracy, output stability, and the transition time between interictal and preictal states, and highlighted the importance of considering both inter- and intra-patient variability in seizure prediction.