Accuracy and consistency are two key factors in computer-assisted magnetic resonance (MR) image analysis. However, contrast variation from site to site caused by lack of standardization in MR acquisition impedes consistent measurements. In recent years, image harmonization approaches have been proposed to compensate for contrast variation in MR images. Current harmonization approaches either require cross-site traveling subjects for supervised training or heavily rely on site-specific harmonization models to encourage harmonization accuracy. These requirements potentially limit the application of current harmonization methods in large-scale multi-site studies. In this work, we propose an unsupervised MR harmonization framework, CALAMITI (Contrast Anatomy Learning and Analysis for MR Intensity Translation and Integration), based on information bottleneck theory. CALAMITI learns a disentangled latent space using a unified structure for multi-site harmonization without the need for traveling subjects. Our model is also able to adapt itself to harmonize MR images from a new site with fine tuning solely on images from the new site. Both qualitative and quantitative results show that the proposed method achieves superior performance compared with other unsupervised harmonization approaches.
Precision medicine involves answering counterfactual questions such as "Would this patient respond better to treatment A or treatment B?" These types of questions are causal in nature and require the tools of causal inference to be answered, e.g., with a structural causal model (SCM). In this work, we develop an SCM that models the interaction between demographic information, disease covariates, and magnetic resonance (MR) images of the brain for people with multiple sclerosis (MS). Inference in the SCM generates counterfactual images that show what an MR image of the brain would look like when demographic or disease covariates are changed. These images can be used for modeling disease progression or used for downstream image processing tasks where controlling for confounders is necessary.
Tagged magnetic resonance imaging (MRI) is a widely used imaging technique for measuring tissue deformation in moving organs. Due to tagged MRI's intrinsic low anatomical resolution, another matching set of cine MRI with higher resolution is sometimes acquired in the same scanning session to facilitate tissue segmentation, thus adding extra time and cost. To mitigate this, in this work, we propose a novel dual-cycle constrained bijective VAE-GAN approach to carry out tagged-to-cine MR image synthesis. Our method is based on a variational autoencoder backbone with cycle reconstruction constrained adversarial training to yield accurate and realistic cine MR images given tagged MR images. Our framework has been trained, validated, and tested using 1,768, 416, and 1,560 subject-independent paired slices of tagged and cine MRI from twenty healthy subjects, respectively, demonstrating superior performance over the comparison methods. Our method can potentially be used to reduce the extra acquisition time and cost, while maintaining the same workflow for further motion analyses.
Since its renaissance, deep learning has been widely used in various medical imaging tasks and has achieved remarkable success in many medical imaging applications, thereby propelling us into the so-called artificial intelligence (AI) era. It is known that the success of AI is mostly attributed to the availability of big data with annotations for a single task and the advances in high performance computing. However, medical imaging presents unique challenges that confront deep learning approaches. In this survey paper, we first highlight both clinical needs and technical challenges in medical imaging and describe how emerging trends in deep learning are addressing these issues. We cover the topics of network architecture, sparse and noisy labels, federating learning, interpretability, uncertainty quantification, etc. Then, we present several case studies that are commonly found in clinical practice, including digital pathology and chest, brain, cardiovascular, and abdominal imaging. Rather than presenting an exhaustive literature survey, we instead describe some prominent research highlights related to these case study applications. We conclude with a discussion and presentation of promising future directions.
Intelligible speech is produced by creating varying internal local muscle groupings---i.e., functional units---that are generated in a systematic and coordinated manner. There are two major challenges in characterizing and analyzing functional units. First, due to the complex and convoluted nature of tongue structure and function, it is of great importance to develop a method that can accurately decode complex muscle coordination patterns during speech. Second, it is challenging to keep identified functional units across subjects comparable due to their substantial variability. In this work, to address these challenges, we develop a new deep learning framework to identify common and subject-specific functional units of tongue motion during speech. Our framework hinges on joint deep graph-regularized sparse non-negative matrix factorization (NMF) using motion quantities derived from displacements by tagged Magnetic Resonance Imaging. More specifically, we transform NMF with sparse and manifold regularizations into modular architectures akin to deep neural networks by means of unfolding the Iterative Shrinkage-Thresholding Algorithm to learn interpretable building blocks and associated weighting map. We then apply spectral clustering to common and subject-specific functional units. Experiments carried out with simulated datasets show that the proposed method surpasses the comparison methods. Experiments carried out with in vivo tongue motion datasets show that the proposed method can determine the common and subject-specific functional units with increased interpretability and decreased size variability.
Domain shift is a major problem for deploying deep networks in clinical practice. Network performance drops significantly with (target) images obtained differently than its (source) training data. Due to a lack of target label data, most work has focused on unsupervised domain adaptation (UDA). Current UDA methods need both source and target data to train models which perform image translation (harmonization) or learn domain-invariant features. However, training a model for each target domain is time consuming and computationally expensive, even infeasible when target domain data are scarce or source data are unavailable due to data privacy. In this paper, we propose a novel self domain adapted network (SDA-Net) that can rapidly adapt itself to a single test subject at the testing stage, without using extra data or training a UDA model. The SDA-Net consists of three parts: adaptors, task model, and auto-encoders. The latter two are pre-trained offline on labeled source images. The task model performs tasks like synthesis, segmentation, or classification, which may suffer from the domain shift problem. At the testing stage, the adaptors are trained to transform the input test image and features to reduce the domain shift as measured by the auto-encoders, and thus perform domain adaptation. We validated our method on retinal layer segmentation from different OCT scanners and T1 to T2 synthesis with T1 from different MRI scanners and with different imaging parameters. Results show that our SDA-Net, with a single test subject and a short amount of time for self adaptation at the testing stage, can achieve significant improvements.
Medical images are increasingly used as input to deep neural networks to produce quantitative values that aid researchers and clinicians. However, standard deep neural networks do not provide a reliable measure of uncertainty in those quantitative values. Recent work has shown that using dropout during training and testing can provide estimates of uncertainty. In this work, we investigate using dropout to estimate epistemic and aleatoric uncertainty in a CT-to-MR image translation task. We show that both types of uncertainty are captured, as defined, providing confidence in the output uncertainty estimates.
Medical images are often used to detect and characterize pathology and disease; however, automatically identifying and segmenting pathology in medical images is challenging because the appearance of pathology across diseases varies widely. To address this challenge, we propose a Bayesian deep learning method that learns to translate healthy computed tomography images to magnetic resonance images and simultaneously calculates voxel-wise uncertainty. Since high uncertainty occurs in pathological regions of the image, this uncertainty can be used for unsupervised anomaly segmentation. We show encouraging experimental results on an unsupervised anomaly segmentation task by combining two types of uncertainty into a novel quantity we call scibilic uncertainty.
Image synthesis learns a transformation from the intensity features of an input image to yield a different tissue contrast of the output image. This process has been shown to have application in many medical image analysis tasks including imputation, registration, and segmentation. To carry out synthesis, the intensities of the input images are typically scaled--i.e., normalized--both in training to learn the transformation and in testing when applying the transformation, but it is not presently known what type of input scaling is optimal. In this paper, we consider seven different intensity normalization algorithms and three different synthesis methods to evaluate the impact of normalization. Our experiments demonstrate that intensity normalization as a preprocessing step improves the synthesis results across all investigated synthesis algorithms. Furthermore, we show evidence that suggests intensity normalization is vital for successful deep learning-based MR image synthesis.