Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"cancer detection": models, code, and papers

Cancer image classification based on DenseNet model

Nov 23, 2020
Ziliang Zhong, Muhang Zheng, Huafeng Mai, Jianan Zhao, Xinyi Liu

Computer-aided diagnosis establishes methods for robust assessment of medical image-based examination. Image processing introduced a promising strategy to facilitate disease classification and detection while diminishing unnecessary expenses. In this paper, we propose a novel metastatic cancer image classification model based on DenseNet Block, which can effectively identify metastatic cancer in small image patches taken from larger digital pathology scans. We evaluate the proposed approach to the slightly modified version of the PatchCamelyon (PCam) benchmark dataset. The dataset is the slightly modified version of the PatchCamelyon (PCam) benchmark dataset provided by Kaggle competition, which packs the clinically-relevant task of metastasis detection into a straight-forward binary image classification task. The experiments indicated that our model outperformed other classical methods like Resnet34, Vgg19. Moreover, we also conducted data augmentation experiment and study the relationship between Batches processed and loss value during the training and validation process.

* 2004-present Journal of Physics: Conference Series 
Access Paper or Ask Questions

Data Augmentation for Detection of Architectural Distortion in Digital Mammography using Deep Learning Approach

Jul 06, 2018
Arthur C. Costa, Helder C. R. Oliveira, Juliana H. Catani, Nestor de Barros, Carlos F. E. Melo, Marcelo A. C. Vieira

Early detection of breast cancer can increase treatment efficiency. Architectural Distortion (AD) is a very subtle contraction of the breast tissue and may represent the earliest sign of cancer. Since it is very likely to be unnoticed by radiologists, several approaches have been proposed over the years but none using deep learning techniques. To train a Convolutional Neural Network (CNN), which is a deep neural architecture, is necessary a huge amount of data. To overcome this problem, this paper proposes a data augmentation approach applied to clinical image dataset to properly train a CNN. Results using receiver operating characteristic analysis showed that with a very limited dataset we could train a CNN to detect AD in digital mammography with area under the curve (AUC = 0.74).

Access Paper or Ask Questions

Joint 2D-3D Breast Cancer Classification

Feb 27, 2020
Gongbo Liang, Xiaoqin Wang, Yu Zhang, Xin Xing, Hunter Blanton, Tawfiq Salem, Nathan Jacobs

Breast cancer is the malignant tumor that causes the highest number of cancer deaths in females. Digital mammograms (DM or 2D mammogram) and digital breast tomosynthesis (DBT or 3D mammogram) are the two types of mammography imagery that are used in clinical practice for breast cancer detection and diagnosis. Radiologists usually read both imaging modalities in combination; however, existing computer-aided diagnosis tools are designed using only one imaging modality. Inspired by clinical practice, we propose an innovative convolutional neural network (CNN) architecture for breast cancer classification, which uses both 2D and 3D mammograms, simultaneously. Our experiment shows that the proposed method significantly improves the performance of breast cancer classification. By assembling three CNN classifiers, the proposed model achieves 0.97 AUC, which is 34.72% higher than the methods using only one imaging modality.

* Accepted by IEEE International Conference of Bioinformatics and Biomedicine (BIBM), 2019 
Access Paper or Ask Questions

Scalable privacy-preserving cancer type prediction with homomorphic encryption

Apr 12, 2022
Esha Sarkar, Eduardo Chielle, Gamze Gursoy, Leo Chen, Mark Gerstein, Michail Maniatakos

Machine Learning (ML) alleviates the challenges of high-dimensional data analysis and improves decision making in critical applications like healthcare. Effective cancer type from high-dimensional genetic mutation data can be useful for cancer diagnosis and treatment, if the distinguishable patterns between cancer types are identified. At the same time, analysis of high-dimensional data is computationally expensive and is often outsourced to cloud services. Privacy concerns in outsourced ML, especially in the field of genetics, motivate the use of encrypted computation, like Homomorphic Encryption (HE). But restrictive overheads of encrypted computation deter its usage. In this work, we explore the challenges of privacy preserving cancer detection using a real-world dataset consisting of more than 2 million genetic information for several cancer types. Since the data is inherently high-dimensional, we explore smaller ML models for cancer prediction to enable fast inference in the privacy preserving domain. We develop a solution for privacy preserving cancer inference which first leverages the domain knowledge on somatic mutations to efficiently encode genetic mutations and then uses statistical tests for feature selection. Our logistic regression model, built using our novel encoding scheme, achieves 0.98 micro-average area under curve with 13% higher test accuracy than similar studies. We exhaustively test our model's predictive capabilities by analyzing the genes used by the model. Furthermore, we propose a fast matrix multiplication algorithm that can efficiently handle high-dimensional data. Experimental results show that, even with 40,000 features, our proposed matrix multiplication algorithm can speed up concurrent inference of multiple individuals by approximately 10x and inference of a single individual by approximately 550x, in comparison to standard matrix multiplication.

Access Paper or Ask Questions

Machine Intelligence-Driven Classification of Cancer Patients-Derived Extracellular Vesicles using Fluorescence Correlation Spectroscopy: Results from a Pilot Study

Feb 01, 2022
Abicumaran Uthamacumaran, Mohamed Abdouh, Kinshuk Sengupta, Zu-hua Gao, Stefano Forte, Thupten Tsering, Julia V Burnier, Goffredo Arena

Patient-derived extracellular vesicles (EVs) that contains a complex biological cargo is a valuable source of liquid biopsy diagnostics to aid in early detection, cancer screening, and precision nanotherapeutics. In this study, we predicted that coupling cancer patient blood-derived EVs to time-resolved spectroscopy and artificial intelligence (AI) could provide a robust cancer screening and follow-up tools. Methods: Fluorescence correlation spectroscopy (FCS) measurements were performed on 24 blood samples-derived EVs. Blood samples were obtained from 15 cancer patients (presenting 5 different types of cancers), and 9 healthy controls (including patients with benign lesions). The obtained FCS autocorrelation spectra were processed into power spectra using the Fast-Fourier Transform algorithm and subjected to various machine learning algorithms to distinguish cancer spectra from healthy control spectra. Results and Applications: The performance of AdaBoost Random Forest (RF) classifier, support vector machine, and multilayer perceptron, were tested on selected frequencies in the N=118 power spectra. The RF classifier exhibited a 90% classification accuracy and high sensitivity and specificity in distinguishing the FCS power spectra of cancer patients from those of healthy controls. Further, an image convolutional neural network (CNN), ResNet network, and a quantum CNN were assessed on the power spectral images as additional validation tools. All image-based CNNs exhibited a nearly equal classification performance with an accuracy of roughly 82% and reasonably high sensitivity and specificity scores. Our pilot study demonstrates that AI-algorithms coupled to time-resolved FCS power spectra can accurately and differentially classify the complex patient-derived EVs from different cancer samples of distinct tissue subtypes.

* 23 pages, 6 figures 
Access Paper or Ask Questions

Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research

Sep 01, 2019
Prasanna Balaprakash, Romain Egele, Misha Salim, Stefan Wild, Venkatram Vishwanath, Fangfang Xia, Tom Brettin, Rick Stevens

Cancer is a complex disease, the understanding and treatment of which are being aided through increases in the volume of collected data and in the scale of deployed computing power. Consequently, there is a growing need for the development of data-driven and, in particular, deep learning methods for various tasks such as cancer diagnosis, detection, prognosis, and prediction. Despite recent successes, however, designing high-performing deep learning models for nonimage and nontext cancer data is a time-consuming, trial-and-error, manual task that requires both cancer domain and deep learning expertise. To that end, we develop a reinforcement-learning-based neural architecture search to automate deep-learning-based predictive model development for a class of representative cancer data. We develop custom building blocks that allow domain experts to incorporate the cancer-data-specific characteristics. We show that our approach discovers deep neural network architectures that have significantly fewer trainable parameters, shorter training time, and accuracy similar to or higher than those of manually designed architectures. We study and demonstrate the scalability of our approach on up to 1,024 Intel Knights Landing nodes of the Theta supercomputer at the Argonne Leadership Computing Facility.

* SC '19: IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis, November 17--22, 2019, Denver, CO 
Access Paper or Ask Questions

Self-Supervised U-Net for Segmenting Flat and Sessile Polyps

Oct 17, 2021
Debayan Bhattacharya, Christian Betz, Dennis Eggert, Alexander Schlaefer

Colorectal Cancer(CRC) poses a great risk to public health. It is the third most common cause of cancer in the US. Development of colorectal polyps is one of the earliest signs of cancer. Early detection and resection of polyps can greatly increase survival rate to 90%. Manual inspection can cause misdetections because polyps vary in color, shape, size and appearance. To this end, Computer-Aided Diagnosis systems(CADx) has been proposed that detect polyps by processing the colonoscopic videos. The system acts a secondary check to help clinicians reduce misdetections so that polyps may be resected before they transform to cancer. Polyps vary in color, shape, size, texture and appearance. As a result, the miss rate of polyps is between 6% and 27% despite the prominence of CADx solutions. Furthermore, sessile and flat polyps which have diameter less than 10 mm are more likely to be undetected. Convolutional Neural Networks(CNN) have shown promising results in polyp segmentation. However, all of these works have a supervised approach and are limited by the size of the dataset. It was observed that smaller datasets reduce the segmentation accuracy of ResUNet++. We train a U-Net to inpaint randomly dropped out pixels in the image as a proxy task. The dataset we use for pre-training is Kvasir-SEG dataset. This is followed by a supervised training on the limited Kvasir-Sessile dataset. Our experimental results demonstrate that with limited annotated dataset and a larger unlabeled dataset, self-supervised approach is a better alternative than fully supervised approach. Specifically, our self-supervised U-Net performs better than five segmentation models which were trained in supervised manner on the Kvasir-Sessile dataset.

Access Paper or Ask Questions

Primary Tumor Origin Classification of Lung Nodules in Spectral CT using Transfer Learning

Jun 30, 2020
Linde S. Hesse, Pim A. de Jong, Josien P. W. Pluim, Veronika Cheplygina

Early detection of lung cancer has been proven to decrease mortality significantly. A recent development in computed tomography (CT), spectral CT, can potentially improve diagnostic accuracy, as it yields more information per scan than regular CT. However, the shear workload involved with analyzing a large number of scans drives the need for automated diagnosis methods. Therefore, we propose a detection and classification system for lung nodules in CT scans. Furthermore, we want to observe whether spectral images can increase classifier performance. For the detection of nodules we trained a VGG-like 3D convolutional neural net (CNN). To obtain a primary tumor classifier for our dataset we pre-trained a 3D CNN with similar architecture on nodule malignancies of a large publicly available dataset, the LIDC-IDRI dataset. Subsequently we used this pre-trained network as feature extractor for the nodules in our dataset. The resulting feature vectors were classified into two (benign/malignant) and three (benign/primary lung cancer/metastases) classes using support vector machine (SVM). This classification was performed both on nodule- and scan-level. We obtained state-of-the art performance for detection and malignancy regression on the LIDC-IDRI database. Classification performance on our own dataset was higher for scan- than for nodule-level predictions. For the three-class scan-level classification we obtained an accuracy of 78\%. Spectral features did increase classifier performance, but not significantly. Our work suggests that a pre-trained feature extractor can be used as primary tumor origin classifier for lung nodules, eliminating the need for elaborate fine-tuning of a new network and large datasets. Code is available at \url{}.

* MSc thesis Linde Hesse 
Access Paper or Ask Questions

Genetic Deep Learning for Lung Cancer Screening

Jul 27, 2019
Hunter Park, Connor Monahan

Convolutional neural networks (CNNs) have shown great promise in improving computer aided detection (CADe). From classifying tumors found via mammography as benign or malignant to automated detection of colorectal polyps in CT colonography, these advances have helped reduce the need for further evaluation with invasive testing and prevent errors from missed diagnoses by acting as a second observer in today's fast paced and high volume clinical environment. CADe methods have become faster and more precise thanks to innovations in deep learning over the past several years. With advancements such as the inception module and utilization of residual connections, the approach to designing CNN architectures has become an art. It is customary to use proven models and fine tune them for particular tasks given a dataset, often requiring tedious work. We investigated using a genetic algorithm (GA) to conduct a neural architectural search (NAS) to generate a novel CNN architecture to find early stage lung cancer in chest x-rays (CXR). Using a dataset of over twelve thousand biopsy proven cases of lung cancer, the trained classification model achieved an accuracy of 97.15% with a PPV of 99.88% and a NPV of 94.81%, beating models such as Inception-V3 and ResNet-152 while simultaneously reducing the number of parameters a factor of 4 and 14, respectively.

Access Paper or Ask Questions

Dual Skip Connections Minimize the False Positive Rate of Lung Nodule Detection in CT images

Oct 25, 2021
Jiahua Xu, Philipp Ernst, Tung Lung Liu, Andreas Nürnberger

Pulmonary cancer is one of the most commonly diagnosed and fatal cancers and is often diagnosed by incidental findings on computed tomography. Automated pulmonary nodule detection is an essential part of computer-aided diagnosis, which is still facing great challenges and difficulties to quickly and accurately locate the exact nodules' positions. This paper proposes a dual skip connection upsampling strategy based on Dual Path network in a U-Net structure generating multiscale feature maps, which aims to minimize the ratio of false positives and maximize the sensitivity for lesion detection of nodules. The results show that our new upsampling strategy improves the performance by having 85.3% sensitivity at 4 FROC per image compared to 84.2% for the regular upsampling strategy or 81.2% for VGG16-based Faster-R-CNN.

* to be published at IEEE EMBC 2021, in IEEE Xplore 
Access Paper or Ask Questions