To facilitate both the detection and the interpretation of findings in chest X-rays, comparison with a previous image of the same patient is very valuable to radiologists. Today, the most common approach for deep learning methods to automatically inspect chest X-rays disregards the patient history and classifies only single images as normal or abnormal. Nevertheless, several methods for assisting in the task of comparison through image registration have been proposed in the past. However, as we illustrate, they tend to miss specific types of pathological changes like cardiomegaly and effusion. Due to assumptions on fixed anatomical structures or their measurements of registration quality, they produce unnaturally deformed warp fields impacting visualization of differences between moving and fixed images. We aim to overcome these limitations, through a new paradigm based on individual rib pair segmentation for anatomy penalized registration. Our method proves to be a natural way to limit the folding percentage of the warp field to 1/6 of the state of the art while increasing the overlap of ribs by more than 25%, implying difference images showing pathological changes overlooked by other methods. We develop an anatomically penalized convolutional multi-stage solution on the National Institutes of Health (NIH) data set, starting from less than 25 fully and 50 partly labeled training images, employing sequential instance memory segmentation with hole dropout, weak labeling, coarse-to-fine refinement and Gaussian mixture model histogram matching. We statistically evaluate the benefits of our method and highlight the limits of currently used metrics for registration of chest X-rays.
Unsupervised anomaly detection models which are trained solely by healthy data, have gained importance in the recent years, as the annotation of medical data is a tedious task. Autoencoders and generative adversarial networks are the standard anomaly detection methods that are utilized to learn the data distribution. However, they fall short when it comes to inference and evaluation of the likelihood of test samples. We propose a novel combination of generative models and a probabilistic graphical model. After encoding image samples by autoencoders, the distribution of data is modeled by Random and Tensorized Sum-Product Networks ensuring exact and efficient inference at test time. We evaluate different autoencoder architectures in combination with Random and Tensorized Sum-Product Networks on mammography images using patch-wise processing and observe superior performance over utilizing the models standalone and state-of-the-art in anomaly detection for medical data.
Machine learning and deep learning methods have become essential for computer-assisted prediction in medicine, with a growing number of applications also in the field of mammography. Typically these algorithms are trained for a specific task, e.g., the classification of lesions or the prediction of a mammogram's pathology status. To obtain a comprehensive view of a patient, models which were all trained for the same task(s) are subsequently ensembled or combined. In this work, we propose a pipeline approach, where we first train a set of individual, task-specific models and subsequently investigate the fusion thereof, which is in contrast to the standard model ensembling strategy. We fuse model predictions and high-level features from deep learning models with hybrid patient models to build stronger predictors on patient level. To this end, we propose a multi-branch deep learning model which efficiently fuses features across different tasks and mammograms to obtain a comprehensive patient-level prediction. We train and evaluate our full pipeline on public mammography data, i.e., DDSM and its curated version CBIS-DDSM, and report an AUC score of 0.962 for predicting the presence of any lesion and 0.791 for predicting the presence of malignant lesions on patient level. Overall, our fusion approaches improve AUC scores significantly by up to 0.04 compared to standard model ensembling. Moreover, by providing not only global patient-level predictions but also task-specific model results that are related to radiological features, our pipeline aims to closely support the reading workflow of radiologists.
Tumor segmentation in multimodal medical images has seen a growing trend towards deep learning based methods. Typically, studies dealing with this topic fuse multimodal image data to improve the tumor segmentation contour for a single imaging modality. However, they do not take into account that tumor characteristics are emphasized differently by each modality, which affects the tumor delineation. Thus, the tumor segmentation is modality- and task-dependent. This is especially the case for soft tissue sarcomas, where, due to necrotic tumor tissue, the segmentation differs vastly. Closing this gap, we develop a modalityspecific sarcoma segmentation model that utilizes multimodal image data to improve the tumor delineation on each individual modality. We propose a simultaneous co-segmentation method, which enables multimodal feature learning through modality-specific encoder and decoder branches, and the use of resource-effcient densely connected convolutional layers. We further conduct experiments to analyze how different input modalities and encoder-decoder fusion strategies affect the segmentation result. We demonstrate the effectiveness of our approach on public soft tissue sarcoma data, which comprises MRI (T1 and T2 sequence) and PET/CT scans. The results show that our multimodal co-segmentation model provides better modality-specific tumor segmentation than models using only the PET or MRI (T1 and T2) scan as input.
The success of machine learning methods for computer vision tasks has driven a surge in computer assisted prediction for medicine and biology. Based on a data-driven relationship between input image and pathological classification, these predictors deliver unprecedented accuracy. Yet, the numerous approaches trying to explain the causality of this learned relationship have fallen short: time constraints, coarse, diffuse and at times misleading results, caused by the employment of heuristic techniques like Gaussian noise and blurring, have hindered their clinical adoption. In this work, we discuss and overcome these obstacles by introducing a neural-network based attribution method, applicable to any trained predictor. Our solution identifies salient regions of an input image in a single forward-pass by measuring the effect of local image-perturbations on a predictor's score. We replace heuristic techniques with a strong neighborhood conditioned inpainting approach, avoiding anatomically implausible, hence adversarial artifacts. We evaluate on public mammography data and compare against existing state-of-the-art methods. Furthermore, we exemplify the approach's generalizability by demonstrating results on chest X-rays. Our solution shows, both quantitatively and qualitatively, a significant reduction of localization ambiguity and clearer conveying results, without sacrificing time efficiency.
Clinical applicability of automated decision support systems depends on a robust, well-understood classification interpretation. Artificial neural networks while achieving class-leading scores fall short in this regard. Therefore, numerous approaches have been proposed that map a salient region of an image to a diagnostic classification. Utilizing heuristic methodology, like blurring and noise, they tend to produce diffuse, sometimes misleading results, hindering their general adoption. In this work we overcome these issues by presenting a model agnostic saliency mapping framework tailored to medical imaging. We replace heuristic techniques with a strong neighborhood conditioned inpainting approach, which avoids anatomically implausible artefacts. We formulate saliency attribution as a map-quality optimization task, enforcing constrained and focused attributions. Experiments on public mammography data show quantitatively and qualitatively more precise localization and clearer conveying results than existing state-of-the-art methods.
Segmentation in 3D scans is playing an increasingly important role in current clinical practice supporting diagnosis, tissue quantification, or treatment planning. The current 3D approaches based on CNN usually suffer from at least three main issues caused predominantly by implementation constraints - first, they require resizing the volume to the lower-resolutional reference dimensions, second, the capacity of such approaches is very limited due to memory restrictions, and third, all slices of volumes have to be available at any given training or testing time. We address these problems by a U-Net-like architecture consisting of bidirectional Convolutional LSTM and convolutional, pooling, upsampling and concatenation layers enclosed into time-distributed wrappers. Our network can either process the full volumes in a sequential manner, or segment slabs of slices on demand. We demonstrate performance of our architecture on vertebrae and liver segmentation tasks in 3D CT scans.
The success of deep convolutional neural networks on image classification and recognition tasks has led to new applications in very diversified contexts, including the field of medical imaging. In this paper we investigate and propose neural network architectures for automated multi-class segmentation of anatomical organs in chest radiographs, namely for lungs, clavicles and heart. We address several open challenges including model overfitting, reducing number of parameters and handling of severely imbalanced data in CXR by fusing recent concepts in convolutional networks and adapting them to the segmentation problem task in CXR. We demonstrate that our architecture combining delayed subsampling, exponential linear units, highly restrictive regularization and a large number of high resolution low level abstract features outperforms state-of-the-art methods on all considered organs, as well as the human observer on lungs and heart. The models use a multi-class configuration with three target classes and are trained and tested on the publicly available JSRT database, consisting of 247 X-ray images the ground-truth masks for which are available in the SCR database. Our best performing model, trained with the loss function based on the Dice coefficient, reached mean Jaccard overlap scores of 95.0\% for lungs, 86.8\% for clavicles and 88.2\% for heart. This architecture outperformed the human observer results for lungs and heart.