In utero fetal brain magnetic resonance images are inherently limited in spatial resolution due to stochastic motion of the fetus. Super-resolution (SR) reconstruction methods have become the go-to approach to compute an isotropic motion-free volume of the fetal brain from low-resolution (LR) series of 2D thick slices. Such pipelines often rely on an optimization problem with a data fidelity term and a regularization term, balanced by a hyperparameter $\alpha$. The lack of ground truth images makes it difficult to adapt $\alpha$ to a given setting of interest in a quantitative manner. In this work, we propose a simulation-based approach to tune $\alpha$ for a given acquisition setting. We focus on two key aspects: the magnetic field strength (1.5T and 3T) and the number of LR series used for reconstruction. Our results show that the optimal $\alpha$ significantly improves the performance compared to the default value, across two commonly used SR pipelines. Qualitative validation on clinical data confirms the importance of tuning this parameter to the setting of interest.
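The idea of tuning $\alpha$ against a simulated ground truth can be illustrated with a toy 1D sketch. Everything here is an assumption made for illustration and not the pipelines evaluated in the abstract: the averaging operator `H`, the quadratic (Tikhonov) smoothness penalty standing in for the regularizer, the choice to place $\alpha$ on the regularization term, and all function names are hypothetical. The sketch solves $\min_x \|Hx - y\|_2^2 + \alpha \|Dx\|_2^2$ in closed form and picks the $\alpha$ with the lowest error against the known simulated signal.

```python
import numpy as np


def make_forward_operator(n_hr, factor):
    """Hypothetical acquisition model: each LR sample averages `factor` HR samples."""
    n_lr = n_hr // factor
    H = np.zeros((n_lr, n_hr))
    for i in range(n_lr):
        H[i, i * factor:(i + 1) * factor] = 1.0 / factor
    return H


def first_difference(n):
    """Finite-difference matrix, used here as a simple smoothness regularizer."""
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D


def reconstruct(y, H, D, alpha):
    """Closed-form minimizer of ||Hx - y||^2 + alpha * ||Dx||^2 (alpha > 0)."""
    A = H.T @ H + alpha * (D.T @ D)
    return np.linalg.solve(A, H.T @ y)


def tune_alpha(x_true, H, D, alphas, noise_std, rng):
    """Grid search: simulate one noisy LR signal, score each alpha against x_true."""
    y = H @ x_true + noise_std * rng.standard_normal(H.shape[0])
    errors = [float(np.mean((reconstruct(y, H, D, a) - x_true) ** 2)) for a in alphas]
    return alphas[int(np.argmin(errors))], errors


rng = np.random.default_rng(0)
n_hr = 120
x_true = np.sin(np.linspace(0.0, 4.0 * np.pi, n_hr))  # simulated ground truth
H = make_forward_operator(n_hr, factor=3)
D = first_difference(n_hr)
alphas = np.logspace(-4, 2, 13)  # candidate regularization weights
best_alpha, errors = tune_alpha(x_true, H, D, alphas, noise_std=0.05, rng=rng)
print(f"best alpha on this simulation: {best_alpha:.3g}")
```

Because the ground truth is simulated, the error for each candidate $\alpha$ can be measured directly, which is the property the abstract exploits; changing `factor` or `noise_std` mimics a different acquisition setting and may shift the selected value.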
Superresolution T2-weighted fetal-brain magnetic-resonance imaging (FBMRI) traditionally relies on the availability of several orthogonal low-resolution series of 2-dimensional thick slices (volumes). In practice, only a few low-resolution volumes are acquired. Thus, optimization-based image-reconstruction methods require strong regularization using hand-crafted regularizers (e.g., total variation). Yet, due to in utero fetal motion and the rapidly changing fetal brain anatomy, the acquisition of the high-resolution images that are required to train supervised learning methods is difficult. In this paper, we sidestep this difficulty by providing a proof of concept of a self-supervised single-volume superresolution framework for T2-weighted FBMRI (SAIR). We validate SAIR quantitatively in a motion-free simulated environment. Our results for different noise levels and resolution ratios suggest that SAIR is comparable to multiple-volume superresolution reconstruction methods. We also evaluate SAIR qualitatively on clinical FBMRI data. The results suggest SAIR could be incorporated into current reconstruction pipelines.
This paper focuses on uncertainty estimation for white matter lesion (WML) segmentation in magnetic resonance imaging (MRI). On one side, voxel-scale segmentation errors cause the erroneous delineation of the lesions; on the other side, lesion-scale detection errors lead to wrong lesion counts. Both of these factors are clinically relevant for the assessment of multiple sclerosis patients. This work aims to compare the ability of different voxel- and lesion-scale uncertainty measures to capture errors related to segmentation and lesion detection, respectively. Our main contributions are (i) proposing new measures of lesion-scale uncertainty that do not utilise voxel-scale uncertainties; (ii) extending an error retention curve analysis framework to the evaluation of lesion-scale uncertainty measures. Our results obtained on the multi-center testing set of 58 patients demonstrate that the proposed lesion-scale measure achieves the best performance among the analysed measures. All code implementations are provided at https://github.com/NataliiaMolch/MS_WML_uncs
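For readers unfamiliar with error retention curves, the following is a minimal voxel-scale sketch under stated assumptions; it is not taken from the linked repository. The function names, the use of the Dice score as the quality metric, and the toy arrays are all hypothetical. The idea: predictions are ranked by uncertainty, only the most certain fraction is retained, the rest is replaced by the ground truth, and the metric is recorded at each retention fraction. An uncertainty measure that ranks errors as most uncertain yields a curve that stays high for longer.

```python
import numpy as np


def dice(pred, target):
    """Dice overlap between two binary arrays (1.0 if both are empty)."""
    denom = pred.sum() + target.sum()
    return 1.0 if denom == 0 else 2.0 * np.sum(pred * target) / denom


def retention_curve(pred, target, uncertainty, fractions):
    """Dice score when only the most certain fraction of predictions is retained.

    Voxels outside the retained set are replaced by the ground truth, so the
    curve starts at 1.0 (nothing retained) and ends at the plain Dice score.
    """
    order = np.argsort(uncertainty)  # ascending: most certain voxels first
    scores = []
    for f in fractions:
        n_keep = int(round(f * pred.size))
        corrected = target.copy()
        corrected[order[:n_keep]] = pred[order[:n_keep]]
        scores.append(dice(corrected, target))
    return np.array(scores)


def area_under_curve(fractions, scores):
    """Trapezoidal area under the retention curve."""
    return float(np.sum((scores[1:] + scores[:-1]) / 2.0 * np.diff(fractions)))


# Toy example (hypothetical data): errors at indices 1 and 5 carry high uncertainty.
pred = np.array([1, 1, 0, 0, 1, 0])
target = np.array([1, 0, 0, 0, 1, 1])
uncertainty = np.array([0.1, 0.9, 0.2, 0.1, 0.05, 0.8])
fractions = np.linspace(0.0, 1.0, 7)

scores = retention_curve(pred, target, uncertainty, fractions)
auc = area_under_curve(fractions, scores)
print(scores, auc)
```

A lesion-scale variant would rank whole lesions instead of voxels; how that ranking and replacement are defined is specific to the paper and is not reproduced here.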
Creating large annotated datasets represents a major bottleneck for the development of deep learning models in radiology. To overcome this, we propose a combined use of weak labels (imprecise, but fast-to-create annotations) and Transfer Learning (TL). Specifically, we explore inductive TL, where source and target domains are identical, but tasks are different due to a label shift: our target labels are created manually by three radiologists, whereas the source weak labels are generated automatically from textual radiology reports. We frame knowledge transfer as hyperparameter optimization, thus avoiding heuristic choices that are frequent in related works. We investigate the relationship between model size and TL, comparing a low-capacity VGG with a higher-capacity SEResNeXt. The task that we address is change detection in follow-up glioma imaging: we extracted 1693 T2-weighted magnetic resonance imaging difference maps from 183 patients, and classified them into stable or unstable according to tumor evolution. Weak labeling allowed us to increase dataset size more than 3-fold, and improve VGG classification results from 75% to 82% Area Under the ROC Curve (AUC) (p=0.04). Mixed training from scratch led to higher performance than fine-tuning or feature extraction. To assess generalizability, we also ran inference on an open dataset (BraTS-2015: 15 patients, 51 difference maps), reaching up to 76% AUC. Overall, results suggest that, compared with computer vision problems, medical imaging problems may benefit from smaller models and different TL strategies, and that report-generated weak labels are effective in improving model performance. Code, in-house dataset and BraTS labels are released.
Resting-state functional Magnetic Resonance Imaging (fMRI) is a powerful imaging technique for studying functional development of the brain in utero. However, unpredictable and excessive movement of fetuses has limited its clinical applicability. Previous studies have focused primarily on the accurate estimation of the motion parameters, employing a single-step 3D interpolation at each individual time frame to recover a motion-free 4D fMRI image. Using only information from a 3D spatial neighborhood neglects the temporal structure of fMRI and useful information from neighboring timepoints. Here, we propose a novel technique based on four-dimensional iterative reconstruction of the motion-scattered fMRI slices. Quantitative evaluation of the proposed method on a cohort of real clinical fetal fMRI data indicates improvement of reconstruction quality compared to the conventional 3D interpolation approaches.
Diffusion MRI (dMRI) of the developing brain can provide valuable insights into white matter development. However, slice thickness in fetal dMRI is typically high (i.e., 3-5 mm) to freeze the in-plane motion, which reduces the sensitivity of the dMRI signal to the underlying anatomy. In this study, we aim to overcome this problem by using autoencoders to learn unsupervised efficient representations of brain slices in a latent space, using raw dMRI signals and their spherical harmonics (SH) representation. We first learn and quantitatively validate the autoencoders on the developing Human Connectome Project pre-term newborn data, and further test the method on fetal data. Our results show that the autoencoder in the signal domain better synthesized the raw signal. Interestingly, the fractional anisotropy and, to a lesser extent, the mean diffusivity, are best recovered in missing slices by using the autoencoder trained with SH coefficients. A comparison was performed with the same maps reconstructed using an autoencoder trained with raw signals, as well as conventional interpolation methods of raw signals and SH coefficients. From these results, we conclude that the recovery of missing/corrupted slices should be performed in the signal domain if the goal is to recover the raw signal, and in the SH domain if diffusion tensor properties (e.g., fractional anisotropy) are targeted. Notably, the trained autoencoders were able to generalize to fetal dMRI data acquired using a much smaller number of diffusion gradients and a lower b-value, where we qualitatively show the consistency of the estimated diffusion tensor maps.
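As a reminder of the two diffusion tensor properties named above, the sketch below computes fractional anisotropy (FA) and mean diffusivity (MD) from a single 3x3 diffusion tensor using their standard textbook definitions, $\mathrm{MD} = (\lambda_1 + \lambda_2 + \lambda_3)/3$ and $\mathrm{FA} = \sqrt{3/2}\,\sqrt{\sum_i (\lambda_i - \mathrm{MD})^2} / \sqrt{\sum_i \lambda_i^2}$. The function name and the example tensors are hypothetical, and the abstract does not state which software was used to estimate these maps.

```python
import numpy as np


def fa_md(tensor):
    """Fractional anisotropy and mean diffusivity of a symmetric 3x3 tensor."""
    evals = np.linalg.eigvalsh(tensor)  # eigenvalues of a symmetric matrix
    md = float(evals.mean())
    norm = np.sqrt(np.sum(evals ** 2))
    if norm == 0.0:
        return 0.0, md  # degenerate all-zero tensor: define FA as 0
    fa = float(np.sqrt(1.5 * np.sum((evals - md) ** 2)) / norm)
    return fa, md


# Hypothetical tensors (units of mm^2/s), for illustration only.
isotropic = np.diag([1.0e-3, 1.0e-3, 1.0e-3])  # equal diffusion in all directions
fibre_like = np.diag([1.7e-3, 0.3e-3, 0.3e-3])  # one dominant direction

fa_iso, md_iso = fa_md(isotropic)
fa_fib, md_fib = fa_md(fibre_like)
print(fa_iso, md_iso, fa_fib, md_fib)
```

FA is 0 for perfectly isotropic diffusion and approaches 1 when diffusion is confined to one direction, which is why it is sensitive to how well a missing slice is recovered in anisotropic white matter.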
The fetal cortical plate (CP) undergoes drastic morphological changes during in utero development. Therefore, CP growth and folding patterns are key indicators in the assessment of brain development and maturation. Magnetic resonance imaging (MRI) offers specific insights for the analysis of quantitative imaging biomarkers. Nonetheless, accurate and, more importantly, topologically correct MR image segmentation remains the key baseline for such analysis. In this study, we propose a deep learning segmentation framework for automatic and morphologically consistent segmentation of the CP in fetal brain MRI. Our contribution is twofold. First, we generalized a multi-dimensional topological loss function in order to enhance the topological accuracy. Second, we introduced hole ratio, a new topology-based validation measure that quantifies the size of the topological defects taking into account the size of the structure of interest. Using two publicly available datasets, we quantitatively evaluated our proposed method on 27 fetal brains with three complementary metrics (overlap-, distance-, and topology-based). Our results show that our topology-integrative framework outperforms state-of-the-art training loss functions on super-resolution (SR) reconstructed clinical MRI, not only in shape correctness but also in the classical evaluation metrics. Furthermore, results on 31 additional out-of-domain SR reconstructions from clinical acquisitions were qualitatively assessed by three experts. The experts' consensus ranked our TopoCP method as the best segmentation in 100\% of the cases with a high inter-expert agreement. Overall, both quantitative and qualitative results, on a wide range of gestational ages and number of cases, support the generalizability and added value of our topology-guided framework for fetal CP segmentation.
Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the usage of machine learning in high-stakes industrial applications, such as autonomous driving and medicine. This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these properties to be assessed, as the training, validation and test data are often identically distributed. Recently, a range of dedicated benchmarks have appeared, featuring both distributionally matched and shifted data. Among these benchmarks, the Shifts dataset stands out in terms of the diversity of tasks as well as the data modalities it features. While most of the benchmarks are heavily dominated by 2D image classification tasks, Shifts contains tabular weather forecasting, machine translation, and vehicle motion prediction tasks. This enables the robustness properties of models to be assessed on a diverse set of industrial-scale tasks and either universal or directly applicable task-specific conclusions to be reached. In this paper, we extend the Shifts Dataset with two datasets sourced from industrial, high-risk applications of high societal importance. Specifically, we consider the tasks of segmentation of white matter Multiple Sclerosis lesions in 3D magnetic resonance brain images and the estimation of power consumption in marine cargo vessels. Both tasks feature ubiquitous distributional shifts and a strict safety requirement due to the high cost of errors. These new datasets will allow researchers to further explore robust generalization and uncertainty estimation in new situations. In this work, we provide a description of the dataset and baseline results for both tasks.
In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge used the FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). Twenty international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine-tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm, which consisted of an asymmetrical U-Net network architecture, performed significantly better than the other submissions. This paper provides a first-of-its-kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.