We introduce the notion of semantic image quality for applications where image quality relies on semantic requirements. Working in fetal ultrasound, where ranking is challenging and annotations are noisy, we design a robust coarse-to-fine model that ranks images based on their semantic image quality and endow our predicted rankings with an uncertainty estimate. To annotate rankings on training data, we design an efficient ranking annotation scheme based on the merge sort algorithm. Finally, we compare our ranking algorithm to a number of state-of-the-art ranking algorithms on a challenging fetal ultrasound quality assessment task, showing the superior performance of our method on the majority of rank correlation metrics.
The identification and localisation of pathological tissues in medical images continues to command much attention among deep learning practitioners. When trained on abundant datasets, deep neural networks can match or exceed human performance. However, the scarcity of annotated data complicates the training of these models. Data augmentation techniques can compensate for a lack of training samples. However, many commonly used augmentation methods can fail to provide meaningful samples during model fitting. We present local gamma augmentation, a technique for introducing new instances of intensities in pathological tissues. We leverage local gamma augmentation to compensate for a bias in intensities corresponding to ischemic stroke lesions in human brain MRIs. On three datasets, we show how local gamma augmentation can improve the image-level sensitivity of a deep neural network tasked with ischemic lesion segmentation on magnetic resonance images.
The current state-of-the art techniques for image segmentation are often based on U-Net architectures, a U-shaped encoder-decoder networks with skip connections. Despite the powerful performance, the architecture often does not perform well when used on data which has different characteristics than the data it was trained on. Many techniques for improving performance in the presence of domain shift have been developed, however typically only have loose connections to the theory of domain adaption. In this work, we propose an unsupervised domain adaptation framework for U-Nets with theoretical guarantees based on the Margin Disparity Discrepancy [1] called the MDD-UNet. We evaluate the proposed technique on the task of hippocampus segmentation, and find that the MDD-UNet is able to learn features which are domain-invariant with no knowledge about the labels in the target domain. The MDD-UNet improves performance over the standard U-Net on 11 out of 12 combinations of datasets. This work serves as a proof of concept by demonstrating an improvement on the U-Net in it's standard form without modern enhancements, which opens up a new avenue of studying domain adaptation for models with very large hypothesis spaces from both methodological and practical perspectives. Code is available at https://github.com/asbjrnmunk/mdd-unet.
Deep learning-based models in medical imaging often struggle to generalize effectively to new scans due to data heterogeneity arising from differences in hardware, acquisition parameters, population, and artifacts. This limitation presents a significant challenge in adopting machine learning models for clinical practice. We propose an unsupervised method for robust domain adaptation in brain MRI segmentation by leveraging MRI-specific augmentation techniques. To evaluate the effectiveness of our method, we conduct extensive experiments across diverse datasets, modalities, and segmentation tasks, comparing against the state-of-the-art methods. The results show that our proposed approach achieves high accuracy, exhibits broad applicability, and showcases remarkable robustness against domain shift in various tasks, surpassing the state-of-the-art performance in the majority of cases.
Cerebral Microbleeds (CMBs), typically captured as hypointensities from susceptibility-weighted imaging (SWI), are particularly important for the study of dementia, cerebrovascular disease, and normal aging. Recent studies on COVID-19 have shown an increase in CMBs of coronavirus cases. Automatic detection of CMBs is challenging due to the small size and amount of CMBs making the classes highly imbalanced, lack of publicly available annotated data, and similarity with CMB mimics such as calcifications, irons, and veins. Hence, the existing deep learning methods are mostly trained on very limited research data and fail to generalize to unseen data with high variability and cannot be used in clinical setups. To this end, we propose an efficient 3D deep learning framework that is actively trained on multi-domain data. Two public datasets assigned for normal aging, stroke, and Alzheimer's disease analysis as well as an in-house dataset for COVID-19 assessment are used to train and evaluate the models. The obtained results show that the proposed method is robust to low-resolution images and achieves 78% recall and 80% precision on the entire test set with an average false positive of 1.6 per scan.
Purpose: Risk-stratified breast cancer screening might improve early detection and efficiency without comprising quality. However, modern mammography-based risk models do not ensure adaptation across vendor-domains and rely on cancer precursors, associated with short-term risk, which might limit long-term risk assessment. We report a cross-vendor mammographic texture model for long-term risk. Approach: The texture model was robustly trained using two systematically designed case-control datasets. Textural features, indicative of future breast cancer, were learned by excluding samples with diagnosed/potential malignancies from training. An augmentation-based domain adaption technique, based on flavorization of mammographic views, ensured generalization across vendor-domains. The model was validated in 66,607 consecutively screened Danish women with flavorized Siemens views and 25,706 Dutch women with Hologic-processed views. Performances were evaluated for interval cancers (IC) within two years from screening and long-term cancers (LTC) from two years after screening. The texture model was combined with established risk factors to flag 10% of women with the highest risk. Results: In Danish women, the texture model achieved an area under the receiver operating characteristic (AUC) of 0.71 and 0.65 for ICs and LTCs, respectively. In Dutch women with Hologic-processed views, the AUCs were not different from AUCs in Danish women with flavorized views. The AUC for texture combined with established risk factors increased to 0.68 for LTCs. The 10% of women flagged as high-risk accounted for 25.5% of ICs and 24.8% of LTCs. Conclusions: The texture model robustly estimated long-term breast cancer risk while adapting to an unseen processed vendor-domain and identified a clinically relevant high-risk subgroup.
Medical images used in clinical practice are heterogeneous and not the same quality as scans studied in academic research. Preprocessing breaks down in extreme cases when anatomy, artifacts, or imaging parameters are unusual or protocols are different. Methods robust to these variations are most needed. A novel deep learning method is proposed for fast and accurate segmentation of the human brain into 132 regions. The proposed model uses an efficient U-Net-like network and benefits from the intersection points of different views and hierarchical relations for the fusion of the orthogonal 2D planes and brain labels during the end-to-end training. Weakly supervised learning is deployed to take the advantage of partially labeled data for the whole brain segmentation and estimation of the intracranial volume (ICV). Moreover, data augmentation is used to expand the magnetic resonance imaging (MRI) data by generating realistic brain scans with high variability for robust training of the model while preserving data privacy. The proposed method can be applied to brain MRI data including skull or any other artifacts without preprocessing the images or a drop in performance. Several experiments using different atlases are conducted to evaluate the segmentation performance of the trained model compared to the state-of-the-art, and the results show higher segmentation accuracy and robustness of the proposed model compared to the existing methods across different intra- and inter-domain datasets.
The insertion of deep learning in medical image analysis had lead to the development of state-of-the art strategies in several applications such a disease classification, as well as abnormality detection and segmentation. However, even the most advanced methods require a huge and diverse amount of data to generalize. Because in realistic clinical scenarios, data acquisition and annotation is expensive, deep learning models trained on small and unrepresentative data tend to outperform when deployed in data that differs from the one used for training (e.g data from different scanners). In this work, we proposed a domain adaptation methodology to alleviate this problem in segmentation models. Our approach takes advantage of the properties of adversarial domain adaptation and consistency training to achieve more robust adaptation. Using two datasets with white matter hyperintensities (WMH) annotations, we demonstrated that the proposed method improves model generalization even in corner cases where individual strategies tend to fail.
The information bottleneck (IB) principle has been suggested as a way to analyze deep neural networks. The learning dynamics are studied by inspecting the mutual information (MI) between the hidden layers and the input and output. Notably, separate fitting and compression phases during training have been reported. This led to some controversy including claims that the observations are not reproducible and strongly dependent on the type of activation function used as well as on the way the MI is estimated. Our study confirms that different ways of binning when computing the MI lead to qualitatively different results, either supporting or refusing IB conjectures. To resolve the controversy, we study the IB principle in settings where MI is non-trivial and can be computed exactly. We monitor the dynamics of quantized neural networks, that is, we discretize the whole deep learning system so that no approximation is required when computing the MI. This allows us to quantify the information flow without measurement errors. In this setting, we observed a fitting phase for all layers and a compression phase for the output layer in all experiments; the compression in the hidden layers was dependent on the type of activation function. Our study shows that the initial IB results were not artifacts of binning when computing the MI. However, the critical claim that the compression phase may not be observed for some networks also holds true.
Learning temporal patterns from multivariate longitudinal data is challenging especially in cases when data is sporadic, as often seen in, e.g., healthcare applications where the data can suffer from irregularity and asynchronicity as the time between consecutive data points can vary across features and samples, hindering the application of existing deep learning models that are constructed for complete, evenly spaced data with fixed sequence lengths. In this paper, a novel deep learning-based model is developed for modeling multiple temporal features in sporadic data using an integrated deep learning architecture based on a recurrent neural network (RNN) unit and a continuous-time autoregressive (CAR) model. The proposed model, called CARRNN, uses a generalized discrete-time autoregressive model that is trainable end-to-end using neural networks modulated by time lags to describe the changes caused by the irregularity and asynchronicity. It is applied to multivariate time-series regression tasks using data provided for Alzheimer's disease progression modeling and intensive care unit (ICU) mortality rate prediction, where the proposed model based on a gated recurrent unit (GRU) achieves the lowest prediction errors among the proposed RNN-based models and state-of-the-art methods using GRUs and long short-term memory (LSTM) networks in their architecture.