In this work we propose a deep learning network for deformable image registration (DIRNet). The DIRNet consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer, which generates the displacement vector field that enables the resampler to warp the moving image to the fixed image. The DIRNet is trained end-to-end by unsupervised optimization of a similarity metric between input image pairs. A trained DIRNet can be applied to perform registration on unseen image pairs in one pass, thus non-iteratively. Evaluation was performed with registration of images of handwritten digits (MNIST) and cardiac cine MR scans (Sunnybrook Cardiac Data). The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with substantially shorter execution times.
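The resampling step in the abstract above, where a displacement vector field is used to warp the moving image, can be illustrated with a minimal sketch. This is not the DIRNet implementation: the function names `bilinear_sample` and `warp`, the use of bilinear interpolation, the zero padding outside the image, and the convention that the field stores per-pixel `(dy, dx)` offsets into the moving image are all assumptions made for illustration. The ConvNet regressor and the spatial transformer that would produce the field are left out.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D image (list of rows) at a real-valued location.

    Assumption: locations outside the image contribute zero intensity.
    """
    height, width = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    wy, wx = y - y0, x - x0
    value = 0.0
    # Accumulate the four neighbouring pixels, weighted by proximity.
    for yy, weight_y in ((y0, 1.0 - wy), (y0 + 1, wy)):
        for xx, weight_x in ((x0, 1.0 - wx), (x0 + 1, wx)):
            if 0 <= yy < height and 0 <= xx < width:
                value += weight_y * weight_x * image[yy][xx]
    return value


def warp(moving, dvf):
    """Warp `moving` with a dense displacement vector field.

    Assumed convention: dvf[y][x] = (dy, dx), and the warped image at (y, x)
    takes the moving-image intensity at (y + dy, x + dx).
    """
    height, width = len(moving), len(moving[0])
    return [
        [
            bilinear_sample(moving, y + dvf[y][x][0], x + dvf[y][x][1])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Usage: a single bright pixel and a uniform one-pixel displacement.
moving = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
dvf = [[(0.0, 1.0) for _ in range(3)] for _ in range(3)]
warped = warp(moving, dvf)
```

In an unsupervised setting such as the one described, a similarity metric would then be computed between `warped` and the fixed image; which metric is used is not stated in the abstract, so it is omitted here.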
Accurate delineation of the left ventricle (LV) is an important step in the evaluation of cardiac function. In this paper, we present an automatic method for segmentation of the LV in cardiac CT angiography (CCTA) scans. Segmentation is performed in two stages. First, a bounding box around the LV is detected using a combination of three convolutional neural networks (CNNs). Subsequently, to obtain the segmentation of the LV, voxel classification is performed within the defined bounding box using a CNN. The study included CCTA scans of sixty patients: fifty scans were used to train the CNNs for LV localization, five scans were used to train LV segmentation, and the remaining five scans were used to test the method. Automatic segmentation resulted in an average Dice coefficient of 0.85 and a mean absolute surface distance of 1.1 mm. The results demonstrate that automatic segmentation of the LV in CCTA scans using voxel classification with convolutional neural networks is feasible.
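For readers unfamiliar with the Dice coefficient reported above, the following sketch computes it for two binary masks using the standard definition, 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient`, the flat-sequence mask representation, and the choice to return 1.0 when both masks are empty are assumptions for illustration, not details taken from the study.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Assumption: two empty masks are treated as a perfect match (1.0).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Usage: two 4-voxel masks that share one foreground voxel.
automatic = [1, 1, 0, 0]
manual = [1, 0, 1, 0]
overlap = dice_coefficient(automatic, manual)  # 2 * 1 / (2 + 2) = 0.5
```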
Localization of anatomical structures is a prerequisite for many tasks in medical image analysis. We propose a method for automatic localization of one or more anatomical structures in 3D medical images through detection of their presence in 2D image slices using a convolutional neural network (ConvNet). A single ConvNet is trained to detect the presence of the anatomical structure of interest in axial, coronal, and sagittal slices extracted from a 3D image. To allow the ConvNet to analyze slices of different sizes, spatial pyramid pooling is applied. After detection, 3D bounding boxes are created by combining the output of the ConvNet in all slices. In the experiments, 200 chest CT, 100 cardiac CT angiography (CTA), and 100 abdomen CT scans were used. The heart, ascending aorta, aortic arch, and descending aorta were localized in chest CT scans, the left cardiac ventricle in cardiac CTA scans, and the liver in abdomen CT scans. Localization was evaluated using the distances between automatically and manually defined reference bounding box centroids and walls. The best results were achieved in localization of structures with clearly defined boundaries (e.g. aortic arch) and the worst when the structure boundary was not clearly visible (e.g. liver). The method was more robust and accurate when localizing multiple structures.
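One way to combine per-slice detections into a 3D bounding box, as outlined in the abstract above, is sketched below. The abstract does not specify the combination rule, so this sketch assumes one: per orientation, slice probabilities are thresholded and the longest run of consecutive positive slices gives the extent along the axis perpendicular to that orientation. The names `longest_run` and `bounding_box` and the threshold of 0.5 are hypothetical.

```python
def longest_run(flags):
    """Return (start, end), inclusive, of the longest run of True values.

    Returns None if no value is True. Ties go to the earliest run.
    """
    best = None
    start = None
    # A trailing False closes a run that reaches the last slice.
    for index, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = index
        elif not flag and start is not None:
            if best is None or (index - start) > (best[1] - best[0] + 1):
                best = (start, index - 1)
            start = None
    return best


def bounding_box(axial, coronal, sagittal, threshold=0.5):
    """Combine per-slice presence probabilities into one 3D box.

    Each argument lists one probability per slice in that orientation.
    The result holds one (first_slice, last_slice) pair per orientation,
    or None if the structure is not detected in some orientation.
    """
    box = []
    for probabilities in (axial, coronal, sagittal):
        run = longest_run(p >= threshold for p in probabilities)
        if run is None:
            return None
        box.append(run)
    return tuple(box)


# Usage with made-up probabilities; the isolated detection in the last
# axial slice is discarded in favour of the longer run.
box = bounding_box(
    axial=[0.1, 0.9, 0.8, 0.2, 0.7],
    coronal=[0.9, 0.9, 0.1],
    sagittal=[0.0, 0.6],
)
```

Keeping only the longest run is a simple guard against isolated false-positive slices; other rules, such as taking the first and last positive slice, would be equally compatible with the description.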
We propose an automatic method using dilated convolutional neural networks (CNNs) for segmentation of the myocardium and blood pool in cardiovascular MR (CMR) of patients with congenital heart disease (CHD). Ten training and ten test CMR scans cropped to an ROI around the heart were provided in the MICCAI 2016 HVSMR challenge. A dilated CNN with a receptive field of 131x131 voxels was trained for myocardium and blood pool segmentation in axial, sagittal and coronal image slices. Performance was evaluated within the HVSMR challenge. Automatic segmentation of the test scans resulted in Dice indices of 0.80$\pm$0.06 and 0.93$\pm$0.02, average distances to boundaries of 0.96$\pm$0.31 and 0.89$\pm$0.24 mm, and Hausdorff distances of 6.13$\pm$3.76 and 7.07$\pm$3.01 mm for the myocardium and blood pool, respectively. Segmentation took 41.5$\pm$14.7 s per scan. In conclusion, dilated CNNs trained on a small set of CMR images of CHD patients showing large anatomical variability provide accurate myocardium and blood pool segmentations.
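The 131x131 receptive field quoted above can be related to a layer configuration with a short calculation. For stride-1 convolutions, each layer with kernel size k and dilation d enlarges the receptive field by (k - 1)·d. The layer list in `ASSUMED_LAYERS` below is an assumption: it is one dilation schedule that yields 131, and the abstract itself does not state the architecture. The function name `receptive_field` is likewise illustrative.

```python
def receptive_field(layers):
    """Receptive field (along one axis) of stacked stride-1 convolutions.

    `layers` is an iterable of (kernel_size, dilation) pairs.
    """
    size = 1
    for kernel_size, dilation in layers:
        size += (kernel_size - 1) * dilation
    return size


# Assumed schedule (not taken from the abstract): 3x3 kernels with
# exponentially growing dilation, followed by two 1x1 layers, which
# do not enlarge the receptive field.
ASSUMED_LAYERS = [
    (3, 1), (3, 1), (3, 2), (3, 4), (3, 8), (3, 16), (3, 32), (3, 1),
    (1, 1), (1, 1),
]

field = receptive_field(ASSUMED_LAYERS)  # 1 + 2 * (1+1+2+4+8+16+32+1) = 131
```

The same receptive field built from undilated 3x3 convolutions would need 65 layers, which illustrates why dilation is attractive when a large context is needed with few parameters.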
Automatic segmentation of medical images is an important task for many clinical applications. In practice, a wide range of anatomical structures are visualised using different imaging modalities. In this paper, we investigate whether a single convolutional neural network (CNN) can be trained to perform different segmentation tasks. A single CNN is trained to segment six tissues in MR brain images, the pectoral muscle in MR breast images, and the coronary arteries in cardiac CTA. The CNN therefore learns to identify the imaging modality, the visualised anatomical structures, and the tissue classes. For each of the three tasks (brain MRI, breast MRI and cardiac CTA), this combined training procedure resulted in a segmentation performance equivalent to that of a CNN trained specifically for that task, demonstrating the high capacity of CNN architectures. Hence, a single system could be used in clinical practice to automatically perform diverse segmentation tasks without task-specific training.
Automatic segmentation in MR brain images is important for quantitative analysis in large-scale studies with images acquired at all ages. This paper presents a method for the automatic segmentation of MR brain images into a number of tissue classes using a convolutional neural network. To ensure that the method obtains accurate segmentation details as well as spatial consistency, the network uses multiple patch sizes and multiple convolution kernel sizes to acquire multi-scale information about each voxel. The method is not dependent on explicit features, but learns to recognise the information that is important for the classification based on training data. The method requires a single anatomical MR image only. The segmentation method is applied to five different data sets: coronal T2-weighted images of preterm infants acquired at 30 weeks postmenstrual age (PMA) and 40 weeks PMA, axial T2-weighted images of preterm infants acquired at 40 weeks PMA, axial T1-weighted images of ageing adults acquired at an average age of 70 years, and T1-weighted images of young adults acquired at an average age of 23 years. The method obtained the following average Dice coefficients over all segmented tissue classes for each data set, respectively: 0.87, 0.82, 0.84, 0.86 and 0.91. The results demonstrate that the method obtains accurate segmentations in all five sets, and hence demonstrate its robustness to differences in age and acquisition protocol.
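The idea of gathering multi-scale information about each voxel from several patch sizes, mentioned in the abstract above, can be sketched as follows. The patch sizes `(3, 5, 7)`, the zero padding at image borders, the 2D list-of-rows image representation, and the names `extract_patch` and `multi_scale_patches` are all assumptions for illustration; the abstract does not give the sizes or the padding strategy used.

```python
def extract_patch(image, row, col, size):
    """Square patch of odd `size` centred on (row, col) of a 2D image.

    Assumption: positions outside the image are filled with zeros.
    """
    if size % 2 == 0:
        raise ValueError("patch size must be odd so the voxel is centred")
    half = size // 2
    height, width = len(image), len(image[0])
    return [
        [
            image[r][c] if 0 <= r < height and 0 <= c < width else 0
            for c in range(col - half, col + half + 1)
        ]
        for r in range(row - half, row + half + 1)
    ]


def multi_scale_patches(image, row, col, sizes=(3, 5, 7)):
    """Patches of several sizes around the same voxel, keyed by size.

    Small patches keep local detail; large ones add spatial context.
    """
    return {size: extract_patch(image, row, col, size) for size in sizes}


# Usage on a 4x4 toy image whose intensities encode their position.
image = [[4 * r + c for c in range(4)] for r in range(4)]
patches = multi_scale_patches(image, row=1, col=1)
```

In a network of the kind described, each patch would feed a separate branch whose outputs are merged before classification of the centre voxel; that part is not shown here.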
The proliferative activity of breast tumors, which is routinely estimated by counting mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues. In this paper, the results from the Assessment of Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The challenge was based on a data set consisting of 12 training and 11 testing subjects, with more than one thousand mitotic figures annotated by multiple observers. Short descriptions and results from the evaluation of eleven methods are presented. The top performing method has an error rate that is comparable to the inter-observer agreement among pathologists.