Centre for Medical Image Computing, Department of Computer Science, University College London, UK, Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, USA, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Boston, USA
In recent years, learning-based image registration methods have gradually moved away from direct supervision with target warps to instead use self-supervision, with excellent results in several registration benchmarks. These approaches utilize a loss function that penalizes the intensity differences between the fixed and moving images, along with a suitable regularizer on the deformation. In this paper, we argue that the relative failure of supervised registration approaches can in part be blamed on the use of regular U-Nets, which are jointly tasked with feature extraction, feature matching, and estimation of deformation. We introduce one simple but crucial modification to the U-Net that disentangles feature extraction and matching from deformation prediction, allowing the U-Net to warp the features, across levels, as the deformation field is evolved. With this modification, direct supervision using target warps begins to outperform self-supervision approaches that require segmentations, presenting new directions for registration when images do not have segmentations. We hope that our findings in this preliminary workshop paper will re-ignite research interest in supervised image registration techniques. Our code is publicly available from https://github.com/balbasty/superwarp.
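The two training signals contrasted above can be sketched in one dimension. This is only an illustration under stated assumptions, not the superwarp implementation: the names `warp_1d`, `self_supervised_loss`, `supervised_loss`, and the weight `lam` are hypothetical, the similarity term is a mean squared intensity difference, and the regularizer is a first-order smoothness penalty on the displacement.

```python
def warp_1d(moving, disp):
    """Resample `moving` at positions i + disp[i] with linear interpolation.

    Sampling positions are clamped to the image extent (an assumption made
    for this sketch).
    """
    n = len(moving)
    out = []
    for i, d in enumerate(disp):
        x = min(max(i + d, 0.0), n - 1.0)
        lo = int(x)
        hi = min(lo + 1, n - 1)
        w = x - lo
        out.append((1.0 - w) * moving[lo] + w * moving[hi])
    return out


def self_supervised_loss(fixed, moving, disp, lam=0.1):
    """Intensity difference between fixed and warped moving image, plus a
    smoothness penalty on the displacement field."""
    warped = warp_1d(moving, disp)
    similarity = sum((f - w) ** 2 for f, w in zip(fixed, warped)) / len(fixed)
    smoothness = sum(
        (disp[i + 1] - disp[i]) ** 2 for i in range(len(disp) - 1)
    ) / (len(disp) - 1)
    return similarity + lam * smoothness


def supervised_loss(disp, target_disp):
    """Direct supervision: mean squared error against a known target warp."""
    return sum((d - t) ** 2 for d, t in zip(disp, target_disp)) / len(disp)


moving = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0]
fixed = [1.0, 2.0, 3.0, 3.0, 3.0, 3.0]   # `moving` shifted by one sample
print(self_supervised_loss(fixed, moving, [1.0] * 6))  # correct warp
print(self_supervised_loss(fixed, moving, [0.0] * 6))  # identity warp
print(supervised_loss([0.0] * 6, [1.0] * 6))
```

The self-supervised loss needs only the two images, whereas the supervised loss needs a target warp, which in practice would have to come from a synthetic or otherwise known deformation.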
The recent introduction of portable, low-field MRI (LF-MRI) into the clinical setting has the potential to transform neuroimaging. However, LF-MRI is limited by lower resolution and signal-to-noise ratio, leading to incomplete characterization of brain regions. To address this challenge, recent advances in machine learning facilitate the synthesis of higher resolution images derived from one or multiple lower resolution scans. Here, we report the extension of a machine learning super-resolution (SR) algorithm to synthesize 1 mm isotropic MPRAGE-like scans from LF-MRI T1-weighted and T2-weighted sequences. Our initial results on a paired dataset of LF and high-field (HF, 1.5T-3T) clinical scans show that: (i) application of available automated segmentation tools directly to LF-MRI images falters; but (ii) segmentation tools succeed when applied to SR images with high correlation to gold standard measurements from HF-MRI (e.g., r = 0.85 for hippocampal volume, r = 0.84 for the thalamus, r = 0.92 for the whole cerebrum). This work demonstrates proof-of-principle post-processing image enhancement from lower resolution LF-MRI sequences. These results lay the foundation for future work to enhance the detection of normal and abnormal image findings at LF and ultimately improve the diagnostic performance of LF-MRI. Our tools are publicly available on FreeSurfer (surfer.nmr.mgh.harvard.edu/).
Training a fully convolutional network for semantic segmentation typically requires a large, labeled dataset with little label noise if good generalization is to be guaranteed. For many segmentation problems, however, data with pixel- or voxel-level labeling accuracy are scarce due to the cost of manual labeling. This problem is exacerbated in domains where manual annotation is difficult, resulting in large amounts of variability in the labeling even across domain experts. Therefore, training segmentation networks to generalize better by learning from both labeled and unlabeled images (called semi-supervised learning) is a problem of both practical and theoretical interest. However, traditional semi-supervised learning methods for segmentation often necessitate hand-crafting a differentiable regularizer specific to a given segmentation problem, which can be extremely time-consuming. In this work, we propose "supervision by denoising" (SUD), a framework that enables us to supervise segmentation models using their denoised output as targets. SUD unifies temporal ensembling and spatial denoising techniques under a spatio-temporal denoising framework and alternates between denoising and network weight updates in an optimization framework for semi-supervision. We validate SUD on three tasks, namely kidney and tumor (3D) segmentation, brain (3D) segmentation, and cortical parcellation (2D), demonstrating a significant improvement in the Dice overlap and the Hausdorff distance of segmentations over supervised-only and temporal ensemble baselines.
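The idea of using denoised network outputs as targets can be sketched as follows. This is a minimal illustration, not the SUD implementation: the abstract does not specify the denoisers, so the exponential moving average (for temporal ensembling), the 3-tap mean filter (for spatial denoising), and the names `temporal_ensemble`, `spatial_denoise`, `denoised_target`, and `momentum` are all assumptions made here.

```python
def temporal_ensemble(running, prediction, momentum=0.9):
    """Exponential moving average of predictions across training steps."""
    return [momentum * r + (1.0 - momentum) * p
            for r, p in zip(running, prediction)]


def spatial_denoise(probs):
    """3-tap mean filter over a 1D probability map, replicating the edges.

    A stand-in for whatever spatial denoiser suits the problem at hand.
    """
    n = len(probs)
    return [
        (probs[max(i - 1, 0)] + probs[i] + probs[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def denoised_target(running, prediction, momentum=0.9):
    """One denoising step: update the temporal ensemble, then smooth it.

    Returns the new running average and the target that would supervise
    the next weight update on an unlabeled image.
    """
    running = temporal_ensemble(running, prediction, momentum)
    return running, spatial_denoise(running)


running = [0.0, 0.0, 0.0]
running, target = denoised_target(running, [0.0, 1.0, 0.0], momentum=0.5)
print(running, target)
```

In a training loop, this step would alternate with a gradient update that pulls the network output on unlabeled images towards `target`.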
Nonlinear inter-modality registration is often challenging due to the lack of objective functions that are good proxies for alignment. Here we propose a synthesis-by-registration method to convert this problem into an easier intra-modality task. We introduce a registration loss for weakly supervised image translation between domains that does not require perfectly aligned training data. This loss capitalises on a registration U-Net with frozen weights, to drive a synthesis CNN towards the desired translation. We complement this loss with a structure preserving constraint based on contrastive learning, which prevents blurring and content shifts due to overfitting. We apply this method to the registration of histological sections to MRI slices, a key step in 3D histology reconstruction. Results on two different public datasets show improvements over registration based on mutual information (13% reduction in landmark error) and synthesis-based algorithms such as CycleGAN (11% reduction), and are comparable to a registration CNN with label supervision.
Despite advances in data augmentation and transfer learning, convolutional neural networks (CNNs) have difficulties generalising to unseen target domains. When applied to segmentation of brain MRI scans, CNNs are highly sensitive to changes in resolution and contrast: even within the same MR modality, decreases in performance can be observed across datasets. We introduce SynthSeg, the first segmentation CNN agnostic to brain MRI scans of any contrast and resolution. SynthSeg is trained with synthetic data sampled from a generative model inspired by Bayesian segmentation. Crucially, we adopt a domain randomisation strategy where we fully randomise the generation parameters to maximise the variability of the training data. Consequently, SynthSeg can segment preprocessed and unpreprocessed real scans of any target domain, without retraining or fine-tuning. Because SynthSeg only requires segmentations to be trained (no images), it can learn from label maps obtained automatically from existing datasets of different populations (e.g., with atrophy and lesions), thus achieving robustness to a wide range of morphological variability. We demonstrate SynthSeg on 5,500 scans of 6 modalities and 10 resolutions, where it exhibits unparalleled generalisation compared to supervised CNNs, test time adaptation, and Bayesian segmentation. The code and trained model are available at https://github.com/BBillot/SynthSeg.
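The principle of generating training images from label maps with randomised parameters can be sketched as below. This is a toy illustration and not the SynthSeg generative model: it covers only image contrast, treats the label map as a flat list, and the function name `synthesize_image` and the sampling ranges are assumptions chosen for the example.

```python
import random


def synthesize_image(label_map, rng):
    """Sample a synthetic image from a label map.

    Each label gets a Gaussian intensity distribution whose mean and
    standard deviation are drawn at random for every new image, so that
    no particular contrast is ever favoured during training.
    """
    labels = sorted(set(label_map))
    params = {
        lab: (rng.uniform(0.0, 255.0), rng.uniform(1.0, 25.0))
        for lab in labels
    }
    image = [rng.gauss(*params[lab]) for lab in label_map]
    return image, params


label_map = [0, 0, 1, 1, 2]            # e.g. background and two tissues
for seed in (0, 1):
    image, params = synthesize_image(label_map, random.Random(seed))
    print(params)                       # a different "contrast" per seed
```

Because the image is sampled from the label map, the segmentation target is known exactly and no real image is needed to form a training pair.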
Purpose: To develop a semi-automated, AI-assisted workflow for segmentation of inflammatory lesions on STIR MRI of sacroiliac joints (SIJs) in adult patients with axial spondyloarthritis. Methods: Baseline human performance in manual segmentation of inflammatory lesions was first established in eight patients with axial spondyloarthritis recruited within a prospective study conducted between April 2018 and July 2019. To improve readers' consistency, a semi-automated procedure was developed, comprising (1) manual segmentation of 'normal bone' and 'disease' regions, (2) automatic segmentation of lesions, i.e., voxels in the disease region with outlying intensity with respect to the normal bone, and (3) human intervention to remove erroneously segmented areas. Segmentation of the disease region (subchondral bone) was automated via supervised deep learning; 200 image slices (eight subjects) were used for algorithm training with cross validation, 48 (two subjects) for testing, and 500 (20 subjects) for evaluation based on visual assessment. The data, code, and model are available at https://github.com/c-hepburn/Bone_MRI. Human and model performance were assessed in terms of Dice coefficient. Results: Intra-reader median Dice coefficients, evaluated from comparison of manual segmentation trials of inflammatory lesions, were 0.63 and 0.69 for the two readers, respectively. Inter-reader median Dice was in the range of 0.53 to 0.56 and increased to 0.84 using the semi-automated approach. The deep learning model ensemble achieved an average Dice of 0.94 in subchondral bone segmentation. Conclusions: We describe a semi-automated, AI-assisted workflow which improves the objectivity and consistency of radiological segmentation of inflammatory load in SIJs.
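Step (2) of the procedure and the Dice coefficient used for evaluation can be sketched as follows. This is an illustration under assumptions, not the code from the Bone_MRI repository: the abstract does not define "outlying", so the rule used here (intensity above the normal-bone mean plus `k` standard deviations) and the names `outlier_lesion_mask`, `dice`, and `k` are hypothetical.

```python
from statistics import mean, stdev


def outlier_lesion_mask(image, normal_mask, disease_mask, k=2.0):
    """Flag voxels in the disease region whose intensity is an outlier
    relative to the normal-bone region.

    Assumed rule: intensity > mean(normal) + k * std(normal).
    """
    normal = [v for v, m in zip(image, normal_mask) if m]
    threshold = mean(normal) + k * stdev(normal)
    return [bool(d) and v > threshold for v, d in zip(image, disease_mask)]


def dice(a, b):
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(map(bool, a)) + sum(map(bool, b))
    return 2.0 * intersection / total if total else 1.0


image = [10, 12, 11, 9, 30, 11, 40, 10]      # toy flattened intensities
normal_mask = [1, 1, 1, 1, 0, 0, 0, 0]       # reader-drawn normal bone
disease_mask = [0, 0, 0, 0, 1, 1, 1, 1]      # reader-drawn disease region
lesion = outlier_lesion_mask(image, normal_mask, disease_mask)
reference = [0, 0, 0, 0, 1, 0, 1, 1]         # toy second-reader mask
print(lesion, dice(lesion, reference))       # Dice is 0.8 for this toy case
```

Step (3), human removal of erroneously segmented areas, would then amount to clearing entries of `lesion` by hand.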
Joint registration of a stack of 2D histological sections to recover 3D structure (3D histology reconstruction) finds application in areas such as atlas building and validation of in vivo imaging. Straightforward pairwise registration of neighbouring sections yields smooth reconstructions but has well-known problems such as the banana effect (straightening of curved structures) and z-shift (drift). While these problems can be alleviated with an external, linearly aligned reference (e.g., Magnetic Resonance images), registration is often inaccurate due to contrast differences and the strong nonlinear distortion of the tissue, including artefacts such as folds and tears. In this paper, we present a probabilistic model of spatial deformation that yields reconstructions for multiple histological stains that are jointly smooth, robust to outliers, and follow the reference shape. The model relies on a spanning tree of latent transforms connecting all the sections and slices, and assumes that the registration between any pair of images can be seen as a noisy version of the composition of (possibly inverted) latent transforms connecting the two images. Bayesian inference is used to compute the most likely latent transforms given a set of pairwise registrations between image pairs within and across modalities. Results on synthetic deformations on multiple MR modalities show that our method can accurately and robustly register multiple contrasts even in the presence of outliers. The 3D histology reconstruction of two stains (Nissl and parvalbumin) from the Allen human brain atlas shows its benefits on real data with severe distortions. We also provide the correspondence to MNI space, bridging the gap between two of the most used atlases in histology and MRI. Data is available at https://openneuro.org/datasets/ds003590 and code at https://github.com/acasamitjana/3dhirest.
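The observation model described above, in which each pairwise registration is a noisy composition of (possibly inverted) latent transforms, can be sketched in a heavily simplified setting. The assumptions are ours, not the paper's: the spanning tree is a simple chain of sections, every transform is a 1D translation (so composition is addition and inversion is negation), and the noise is Gaussian, so the most likely latent transforms minimise a sum of squared residuals. The names `predicted_registration` and `sum_squared_residuals` are hypothetical.

```python
def predicted_registration(latent, i, j):
    """Compose latent translations along the chain from image i to image j.

    latent[k] maps image k to image k + 1; travelling backwards along the
    chain uses the inverse transform, i.e. the negated translation.
    """
    if i <= j:
        return sum(latent[i:j])
    return -sum(latent[j:i])


def sum_squared_residuals(latent, observations):
    """Negative log-likelihood (up to constants) under Gaussian noise.

    `observations` maps an image pair (i, j) to the translation measured
    by registering image i to image j.
    """
    return sum(
        (obs - predicted_registration(latent, i, j)) ** 2
        for (i, j), obs in observations.items()
    )


latent = [1.0, 2.0, -0.5]                    # chain of four images
observations = {(0, 1): 1.0, (0, 2): 3.5, (3, 1): -1.5}
print(predicted_registration(latent, 0, 3))  # 2.5
print(sum_squared_residuals(latent, observations))  # 0.25
```

Inference would search for the `latent` values that minimise this objective; a heavier-tailed noise model would be one way to obtain the robustness to outliers mentioned in the abstract.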
Landmark correspondences are a widely used type of gold standard in image registration. However, the manual placement of corresponding points is subject to high inter-user variability in the chosen annotated locations and in the interpretation of visual ambiguities. In this paper, we introduce a principled strategy for the construction of a gold standard in deformable registration. Our framework: (i) iteratively suggests the most informative location to annotate next, taking into account its redundancy with previous annotations; (ii) extends traditional pointwise annotations by accounting for the spatial uncertainty of each annotation, which can either be directly specified by the user, or aggregated from pointwise annotations from multiple experts; and (iii) naturally provides a new strategy for the evaluation of deformable registration algorithms. Our approach is validated on four different registration tasks. The experimental results show the efficacy of suggesting annotations according to their informativeness, and an improved capacity to assess the quality of the outputs of registration algorithms. In addition, our approach yields, from sparse annotations only, a dense visualization of the errors made by a registration method. The source code of our approach supporting both 2D and 3D data is publicly available at https://github.com/LoicPeter/evaluation-deformable-registration.
Video mosaicking requires the registration of overlapping frames located at distant timepoints in the sequence to ensure global consistency of the reconstructed scene. However, fully automated registration of such long-range pairs is (i) challenging when the registration of images itself is difficult; and (ii) computationally expensive for long sequences due to the large number of candidate pairs for registration. In this paper, we introduce an efficient framework for the active annotation of long-range pairwise correspondences in a sequence. Our framework suggests pairs of images that are expected to be informative to an oracle agent (e.g., a human user, or a reliable matching algorithm) who provides visual correspondences on each suggested pair. Informative pairs are retrieved according to an iterative strategy based on a principled annotation reward coupled with two complementary and online adaptable models of frame overlap. In addition to the efficient construction of a mosaic, our framework provides, as a by-product, ground truth landmark correspondences which can be used for evaluation or learning purposes. We evaluate our approach in both automated and interactive scenarios via experiments on synthetic sequences, on a publicly available dataset for aerial imaging, and on a clinical dataset for placenta mosaicking during fetal surgery.