"cancer detection": models, code, and papers

Automatic Lesion Boundary Segmentation in Dermoscopic Images with Ensemble Deep Learning Methods

Feb 02, 2019
Manu Goyal, Moi Hoon Yap

Early detection of skin cancer, particularly melanoma, is crucial to enable advanced treatment. Due to the rapid growth of skin cancers, there is a growing need for computerized analysis of skin lesions. These processes include detection, classification, and segmentation. The state-of-the-art publicly available datasets for skin lesions are often accompanied by a very limited amount of segmentation ground truth labeling, as it is laborious and expensive. Lesion boundary segmentation is vital to locate the lesion accurately in dermoscopic images and to diagnose different skin lesion types. In this work, we propose the use of fully automated deep learning ensemble methods for accurate lesion boundary segmentation in dermoscopic images. We trained the Mask-RCNN and DeeplabV3+ methods on the ISIC-2017 segmentation training dataset and evaluated the performance of various ensembles of both networks on the ISIC-2017 testing set and the PH2 dataset. Our results showed that the proposed ensemble method segmented the skin lesions with a Jaccard index of 79.58% for the ISBI 2017 test dataset. In comparison to FrCN, FCN, U-Net, and SegNet, the proposed ensemble method outperformed them by 2.48%, 7.42%, 17.95%, and 9.96% for the Jaccard index, respectively. Furthermore, the proposed ensemble method achieved a segmentation accuracy of 95.6% for some representative clinical benign cases, 90.78% for the melanoma cases, and 91.29% for the seborrheic keratosis cases in the ISBI 2017 test dataset, exhibiting better performance than those of FrCN, FCN, U-Net, and SegNet.

* arXiv admin note: text overlap with arXiv:1711.10449 
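
As an illustration of the metric named in this abstract, the sketch below computes the Jaccard index for flat binary masks and combines two model outputs with simple union and intersection ensembles. The abstract does not specify how the ensembles are built, so the two combination rules, the function names, and the toy masks are assumptions for illustration only, not the authors' method.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| for two flat binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return inter / union if union else 1.0


def ensemble_union(a, b):
    """Pixel is lesion if either model says so (hypothetical ensemble rule)."""
    return [int(x or y) for x, y in zip(a, b)]


def ensemble_intersection(a, b):
    """Pixel is lesion only if both models agree (hypothetical ensemble rule)."""
    return [int(x and y) for x, y in zip(a, b)]


# Toy 6-pixel masks standing in for ground truth and two model outputs.
truth = [0, 1, 1, 1, 0, 0]
mask_rcnn = [0, 1, 1, 0, 0, 0]
deeplab = [0, 0, 1, 1, 1, 0]

union_score = jaccard(ensemble_union(mask_rcnn, deeplab), truth)
inter_score = jaccard(ensemble_intersection(mask_rcnn, deeplab), truth)
```

On these toy masks the union ensemble scores higher than either single mask, which is the kind of effect an ensemble is meant to capture; real results depend on how the two networks' errors overlap.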

Evolving the pulmonary nodules diagnosis from classical approaches to deep learning aided decision support: three decades development course and future prospect

Jan 23, 2019
Bo Liu, Wenhao Chi, Xinran Li, Peng Li, Wenhua Liang, Haiping Liu, Wei Wang, Jianxing He

Lung cancer is the commonest cause of cancer deaths worldwide, and its mortality can be reduced significantly by early diagnosis and screening. Since the 1960s, driven by the pressing need to accurately and effectively interpret the massive volume of chest images generated daily, computer-assisted diagnosis of pulmonary nodules has opened up new opportunities to relax the limitations imposed by physicians' subjectivity, experience, and fatigue. Significant advances have been achieved since the 1980s, and consistent efforts have been made to address the grand challenges of how to accurately detect pulmonary nodules with high sensitivity at a low false-positive rate and how to precisely differentiate between benign and malignant nodules. The main goal of this investigation is to provide a comprehensive state-of-the-art review of the computer-assisted nodule detection and benign-malignant classification techniques developed over three decades, which have evolved from the complicated ad hoc analysis pipelines of conventional approaches to simplified, seamlessly integrated deep learning techniques. This review also identifies challenges and highlights opportunities for future work in learning models, learning algorithms, and enhancement schemes for bridging the current state to future prospects and satisfying future demand. As far as the authors know, it is the first review of the literature on the past thirty years of development in computer-assisted diagnosis of lung nodules. We acknowledge the value of potential multidisciplinary research that will bring the computer-assisted diagnosis of pulmonary nodules into the mainstream of clinical medicine, advance state-of-the-art clinical applications, and improve the welfare of both physicians and patients.

* 74 pages, 2 figures 

Normalization of breast MRIs using Cycle-Consistent Generative Adversarial Networks

Dec 16, 2019
Gourav Modanwal, Adithya Vellal, Maciej A. Mazurowski

Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI) is widely used to complement ultrasound examinations and x-ray mammography during the early detection and diagnosis of breast cancer. However, images generated by various MRI scanners (e.g. GE Healthcare vs Siemens) differ both in intensity and noise distribution, preventing algorithms trained on MRIs from one scanner from generalizing successfully to data from other scanners. We propose a method for image normalization to solve this problem. MRI normalization is challenging because it requires both normalizing intensity values and mapping between the noise distributions of different scanners. We utilize a cycle-consistent generative adversarial network to learn a bidirectional mapping between MRIs produced by GE Healthcare and Siemens scanners. This allows us to learn the mapping between two different scanner types without matched data, which is not commonly available. To ensure the preservation of breast shape and structures within the breast, we propose two technical innovations. First, we incorporate a mutual information loss with the CycleGAN architecture to ensure that the structure of the breast is maintained. Second, we propose a modified discriminator architecture which utilizes a smaller field-of-view to ensure the preservation of finer details in the breast tissue. Quantitative and qualitative evaluations show that the second proposed method was able to consistently preserve a high level of detail in the breast structure while also performing the proper intensity normalization and noise mapping. Our results demonstrate that the proposed model can successfully learn a bidirectional mapping between MRIs produced by different vendors, potentially enabling improved accuracy of downstream computational algorithms for diagnosis and detection of breast cancer.
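
For readers unfamiliar with the cycle-consistency idea this abstract relies on, the standard CycleGAN cycle-consistency term can be written as below. The symbols are assumptions chosen for illustration: $G$ maps images from scanner domain $X$ (say, GE Healthcare) to domain $Y$ (say, Siemens), and $F$ maps back. The abstract's mutual information loss and modified discriminator are additional terms that are not described in enough detail here to write down.

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F)
  = \mathbb{E}_{x \sim p_X}\!\left[ \lVert F(G(x)) - x \rVert_1 \right]
  + \mathbb{E}_{y \sim p_Y}\!\left[ \lVert G(F(y)) - y \rVert_1 \right]
```

Minimizing this term pushes a translated image to map back to its original, which is what allows training without matched image pairs.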


WeakSTIL: Weak whole-slide image level stromal tumor infiltrating lymphocyte scores are all you need

Sep 13, 2021
Yoni Schirris, Mendel Engelaer, Andreas Panteli, Hugo Mark Horlings, Efstratios Gavves, Jonas Teuwen

We present WeakSTIL, an interpretable two-stage weak label deep learning pipeline for scoring the percentage of stromal tumor infiltrating lymphocytes (sTIL%) in H&E-stained whole-slide images (WSIs) of breast cancer tissue. The sTIL% score is a prognostic and predictive biomarker for many solid tumor types. However, due to the high labeling efforts and high intra- and interobserver variability within and between expert annotators, this biomarker is currently not used in routine clinical decision making. WeakSTIL compresses tiles of a WSI using a feature extractor pre-trained with self-supervised learning on unlabeled histopathology data and learns to predict precise sTIL% scores for each tile in the tumor bed by using a multiple instance learning regressor that only requires a weak WSI-level label. By requiring only a weak label, we overcome the large annotation efforts required to train currently existing TIL detection methods. We show that WeakSTIL is at least as good as other TIL detection methods when predicting the WSI-level sTIL% score, reaching a coefficient of determination of $0.45\pm0.15$ when compared to scores generated by an expert pathologist, and an AUC of $0.89\pm0.05$ when treating it as the clinically interesting sTIL-high vs sTIL-low classification task. Additionally, we show that the intermediate tile-level predictions of WeakSTIL are highly interpretable, which suggests that WeakSTIL pays attention to latent features related to the number of TILs and the tissue type. In the future, WeakSTIL may be used to provide consistent and interpretable sTIL% predictions to stratify breast cancer patients into targeted therapy arms.

* 8 pages, 8 figures, 1 table, 4 pages supplementary 
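
The sketch below illustrates two ideas from this abstract in the simplest possible form: aggregating tile-level sTIL% predictions into one slide-level score, and scoring slide-level predictions with the coefficient of determination. Mean pooling is an assumption made for illustration; the abstract does not state which aggregation the multiple instance learning regressor uses, and all names and numbers are hypothetical.

```python
def slide_score(tile_scores):
    """Aggregate tile-level sTIL% predictions by mean pooling (an assumption)."""
    return sum(tile_scores) / len(tile_scores)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical tile-level predictions for three slides of different sizes.
slides = [[10, 20, 30], [40, 60], [5, 5, 5, 5]]
# Hypothetical slide-level scores from a pathologist (the weak labels).
pathologist = [25.0, 45.0, 10.0]

predicted = [slide_score(tiles) for tiles in slides]
r2 = r_squared(pathologist, predicted)
```

Only the slide-level label enters the comparison, which is why no tile-level annotation is needed in this setup.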

Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge

Jul 22, 2018
Mitko Veta, Yujing J. Heng, Nikolas Stathonikos, Babak Ehteshami Bejnordi, Francisco Beca, Thomas Wollmann, Karl Rohr, Manan A. Shah, Dayong Wang, Mikael Rousson, Martin Hedlund, David Tellez, Francesco Ciompi, Erwan Zerhouni, David Lanyi, Matheus Viana, Vassili Kovalev, Vitali Liauchuk, Hady Ahmady Phoulady, Talha Qaiser, Simon Graham, Nasir Rajpoot, Erik Sjöblom, Jesper Molin, Kyunghyun Paeng, Sangheum Hwang, Sunggyun Park, Zhipeng Jia, Eric I-Chao Chang, Yan Xu, Andrew H. Beck, Paul J. van Diest, Josien P. W. Pluim

Tumor proliferation is an important biomarker indicative of the prognosis of breast cancer patients. Assessment of tumor proliferation in a clinical setting is a highly subjective and labor-intensive task. Previous efforts to automate tumor proliferation assessment by image analysis only focused on mitosis detection in predefined tumor regions. However, in a real-world scenario, automatic mitosis detection should be performed in whole-slide images (WSIs) and an automatic method should be able to produce a tumor proliferation score given a WSI as input. To address this, we organized the TUmor Proliferation Assessment Challenge 2016 (TUPAC16) on prediction of tumor proliferation scores from WSIs. The challenge dataset consisted of 500 training and 321 testing breast cancer histopathology WSIs. In order to ensure fair and independent evaluation, only the ground truth for the training dataset was provided to the challenge participants. The first task of the challenge was to predict mitotic scores, i.e., to reproduce the manual method of assessing tumor proliferation by a pathologist. The second task was to predict the gene expression based PAM50 proliferation scores from the WSI. The best performing automatic method for the first task achieved a quadratic-weighted Cohen's kappa score of $\kappa$ = 0.567, 95% CI [0.464, 0.671] between the predicted scores and the ground truth. For the second task, the predictions of the top method had a Spearman's correlation coefficient of r = 0.617, 95% CI [0.581, 0.651] with the ground truth. This was the first study that investigated tumor proliferation assessment from WSIs. The achieved results are promising given the difficulty of the tasks and the weakly-labelled nature of the ground truth. However, further research is needed to improve the practical utility of image analysis methods for this task.

* Overview paper of the TUPAC16 challenge: 
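
The first task above is scored with quadratic-weighted Cohen's kappa. A minimal pure-Python sketch of that statistic follows, assuming class labels are encoded as integers `0 .. n_classes - 1`; the function name and the toy labels are illustrative and are not taken from the challenge's evaluation code.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Labels are assumed to be integers in 0 .. n_classes - 1.
    """
    n = len(rater_a)
    # Observed confusion matrix between the two sets of ratings.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes))
              for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            # Expected count under independent raters with the same marginals.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Toy example with three ordinal classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 1]
kappa = quadratic_weighted_kappa(truth, pred, n_classes=3)
```

The quadratic weights penalize a prediction that is two classes off four times as much as one that is a single class off, which suits ordinal scores such as mitotic grades.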

Cervical Optical Coherence Tomography Image Classification Based on Contrastive Self-Supervised Texture Learning

Aug 11, 2021
Kaiyi Chen, Qingbin Wang, Yutao Ma

Background: Cervical cancer seriously affects the health of the female reproductive system. Optical coherence tomography (OCT) emerges as a non-invasive, high-resolution imaging technology for cervical disease detection. However, OCT image annotation is knowledge-intensive and time-consuming, which impedes the training process of deep-learning-based classification models. Objective: This study aims to develop a computer-aided diagnosis (CADx) approach to classifying in-vivo cervical OCT images based on self-supervised learning. Methods: Besides high-level semantic features extracted by a convolutional neural network (CNN), the proposed CADx approach leverages unlabeled cervical OCT images' texture features learned by contrastive texture learning. We conducted ten-fold cross-validation on the OCT image dataset from a multi-center clinical study on 733 patients from China. Results: In a binary classification task for detecting high-risk diseases, including high-grade squamous intraepithelial lesion (HSIL) and cervical cancer, our method achieved an area-under-the-curve (AUC) value of 0.9798 ± 0.0157 with a sensitivity of 91.17 ± 4.99% and a specificity of 93.96 ± 4.72% for OCT image patches; also, it outperformed two out of four medical experts on the test set. Furthermore, our method achieved a 91.53% sensitivity and 97.37% specificity on an external validation dataset containing 287 3D OCT volumes from 118 Chinese patients in a new hospital using a cross-shaped threshold voting strategy. Conclusion: The proposed contrastive-learning-based CADx method outperformed the end-to-end CNN models and provided better interpretability based on texture features, which holds great potential to be used in the clinical protocol of "see-and-treat."

* 8 pages, 5 figures, and 6 tables 

Using Machine Learning to Automate Mammogram Images Analysis

Dec 06, 2020
Xuejiao Tang, Liuhua Zhang, Wenbin Zhang, Xin Huang, Vasileios Iosifidis, Zhen Liu, Mingli Zhang, Enza Messina, Ji Zhang

Breast cancer is the second leading cause of cancer-related death after lung cancer in women. Early detection of breast cancer in X-ray mammography is believed to have effectively reduced the mortality rate. However, a relatively high false positive rate and a low specificity in mammography technology still exist. In this work, a computer-aided automatic mammogram analysis system is proposed to process the mammogram images and automatically classify them as either normal or cancerous. The system consists of three consecutive stages: image processing, feature selection, and image classification. In designing the system, the discrete wavelet transforms (Daubechies 2, Daubechies 4, and Biorthogonal 6.8) and the Fourier cosine transform were first used to parse the mammogram images and extract statistical features. Then, an entropy-based feature selection method was implemented to reduce the number of features. Finally, different pattern recognition methods (including the Back-propagation Network, the Linear Discriminant Analysis, and the Naive Bayes Classifier) and a voting classification scheme were employed. The performance of each classification strategy was evaluated for sensitivity, specificity, and accuracy, and for general performance using the Receiver Operating Characteristic (ROC) curve. Our method is validated on the dataset from the Eastern Health in Newfoundland and Labrador of Canada. The experimental results demonstrated that the proposed automatic mammogram analysis system could effectively improve the classification performance.
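
The voting classification scheme mentioned above can be pictured as a plain majority vote over the three classifiers' outputs. The abstract does not say whether votes are weighted, so unweighted majority voting is an assumption here, and the classifier keys and predictions are made-up placeholders.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for one image."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical per-classifier predictions for three mammograms.
per_classifier = {
    "backprop_network": ["cancerous", "normal", "normal"],
    "lda": ["cancerous", "cancerous", "normal"],
    "naive_bayes": ["normal", "cancerous", "normal"],
}

# zip(*...) regroups the predictions image by image before voting.
voted = [majority_vote(votes) for votes in zip(*per_classifier.values())]
```

With three classifiers and two classes a tie cannot occur; a different number of voters or classes would need an explicit tie-breaking rule.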


Segmentation of Infrared Breast Images Using MultiResUnet Neural Network

Oct 31, 2020
Ange Lou, Shuyue Guan, Nada Kamona, Murray Loew

Breast cancer is the second leading cause of death for women in the U.S. Early detection of breast cancer is key to higher survival rates of breast cancer patients. We are investigating infrared (IR) thermography as a noninvasive adjunct to mammography for breast cancer screening. IR imaging is radiation-free, pain-free, and non-contact. Automatic segmentation of the breast area from the acquired full-size breast IR images will help limit the area for tumor search, as well as reduce the time and effort costs of manual segmentation. Autoencoder-like convolutional and deconvolutional neural networks (C-DCNN) were applied to automatically segment the breast area in IR images in previous studies. In this study, we applied a state-of-the-art deep-learning segmentation model, MultiResUnet, which consists of an encoder part to capture features and a decoder part for precise localization. It was used to segment the breast area by using a set of breast IR images, collected in our pilot study by imaging breast cancer patients and normal volunteers with a thermal infrared camera (N2 Imager). The database we used has 450 images, acquired from 14 patients and 16 volunteers. We used a thresholding method to remove interference in the raw images and remapped them from the original 16-bit to 8-bit, and then cropped and segmented the 8-bit images manually. Experiments using leave-one-out cross-validation (LOOCV) and comparison with the ground-truth images by using Tanimoto similarity show that the average accuracy of MultiResUnet is 91.47%, which is about 2% higher than that of the autoencoder. MultiResUnet offers a better approach to segmenting breast IR images than our previous model.

* 6 pages. Accepted by IEEE AIPR 2019 (Oral) 
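
The preprocessing step described above, thresholding followed by a 16-bit to 8-bit remap, might look roughly like the sketch below. The abstract gives no thresholds or mapping details, so the clip-then-linear-rescale rule, the `low` and `high` parameters, and the sample values are all assumptions for illustration.

```python
def remap_16_to_8(pixels, low, high):
    """Clip 16-bit values to [low, high], then rescale linearly to 0..255.

    `low` and `high` are hypothetical thresholds bounding the useful
    intensity range; values outside it are treated as interference.
    """
    span = high - low
    remapped = []
    for value in pixels:
        clipped = min(max(value, low), high)
        remapped.append(round((clipped - low) * 255 / span))
    return remapped


# Toy row of raw 16-bit sensor values, including two out-of-range readings.
raw = [0, 7000, 7500, 8500, 9000, 65535]
eight_bit = remap_16_to_8(raw, low=7000, high=9000)
```

Clipping before rescaling keeps a few extreme readings from compressing the useful range into a handful of 8-bit levels.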

Skin cancer detection based on deep learning and entropy to detect outlier samples

Sep 10, 2019
Andre G. C. Pacheco, Abder-Rahman Ali, Thomas Trappenberg

We describe our methods to address both tasks of the ISIC 2019 challenge. The goal of this challenge is to provide a diagnosis of skin cancer using images and meta-data. There are nine classes in the dataset; however, one of them is an outlier class and is not present in it. To tackle the challenge, we apply an ensemble of classifiers composed of 13 convolutional neural networks (CNNs), develop two approaches to handle the outlier class, and propose a straightforward method to use the meta-data along with the images. Throughout this report, we detail each methodology and its parameters to make it easy to replicate our work. The results obtained are in accordance with the previous challenges, and the approaches to detect the outlier class and to use the meta-data seem to work properly.
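
One common way to use entropy for outlier detection, in the spirit of this paper's title, is to flag a sample when the predicted class distribution is too flat. The sketch below assumes eight known classes and a hypothetical threshold of 1.5 nats; the abstract does not describe the authors' two approaches, so this is only a generic illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def is_outlier(probs, threshold):
    """Flag a sample whose predicted distribution is too uncertain."""
    return shannon_entropy(probs) > threshold


# Hypothetical softmax outputs over eight known classes.
confident = [0.93] + [0.01] * 7
uncertain = [0.125] * 8

THRESHOLD = 1.5  # hypothetical cut-off; would be tuned on validation data
flags = [is_outlier(confident, THRESHOLD), is_outlier(uncertain, THRESHOLD)]
```

The uniform distribution reaches the maximum entropy of ln 8 for eight classes, so any threshold must lie below that value to flag anything.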


A Generalized Deep Learning Framework for Whole-Slide Image Segmentation and Analysis

Jan 01, 2020
Mahendra Khened, Avinash Kori, Haran Rajkumar, Balaji Srinivasan, Ganapathy Krishnamurthi

Histopathology tissue analysis is considered the gold standard in cancer diagnosis and prognosis. Given the large size of these images and the increase in the number of potential cancer cases, an automated solution as an aid to histopathologists is highly desirable. In the recent past, deep learning-based techniques have provided state-of-the-art results in a wide variety of image analysis tasks, including analysis of digitized slides. However, the size of images and variability in histopathology tasks make it a challenge to develop an integrated framework for histopathology image analysis. We propose a deep learning-based framework for histopathology tissue analysis. We demonstrate the generalizability of our framework, including training and inference, on several open-source datasets, which include CAMELYON (breast cancer metastases), DigestPath (colon cancer), and PAIP (liver cancer) datasets. We discuss multiple types of uncertainties pertaining to data and model, namely aleatoric and epistemic, respectively. Simultaneously, we demonstrate our model's generalization across different data distributions by evaluating some samples on TCGA data. On CAMELYON16 test data (n=139) for the task of lesion detection, the FROC score achieved was 0.86, and on the CAMELYON17 test data (n=500) for the task of pN-staging, the Cohen's kappa score achieved was 0.9090 (third in the open leaderboard). On DigestPath test data (n=212) for the task of tumor segmentation, a Dice score of 0.782 was achieved (fourth in the challenge). On PAIP test data (n=40) for the task of viable tumor segmentation, a Jaccard Index of 0.75 (third in the challenge) was achieved, and for viable tumor burden, a score of 0.633 was achieved (second in the challenge). Our entire framework and related documentation are freely available at GitHub and PyPi.
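
This abstract reports both Dice and Jaccard scores, which measure the same overlap on different scales. The sketch below computes Dice for flat binary masks and converts it to Jaccard with the identity J = D / (2 - D), which holds for a single pair of masks. Function names and toy masks are illustrative; note that the conversion does not apply to scores averaged over many images.

```python
def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2 * inter / total if total else 1.0


def dice_to_jaccard(d):
    """Convert a per-mask Dice score to the equivalent Jaccard index."""
    return d / (2 - d)


# Toy 6-pixel masks standing in for a prediction and its ground truth.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]

d = dice(pred, truth)
j = dice_to_jaccard(d)
```

Because Dice is never smaller than Jaccard for the same masks, scores reported with the two metrics on different datasets are not directly comparable.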
