Veronika Cheplygina

Primary Tumor Origin Classification of Lung Nodules in Spectral CT using Transfer Learning

Jun 30, 2020

Linde S. Hesse, Pim A. de Jong, Josien P. W. Pluim, Veronika Cheplygina

Abstract: Early detection of lung cancer has been proven to decrease mortality significantly. A recent development in computed tomography (CT), spectral CT, can potentially improve diagnostic accuracy, as it yields more information per scan than regular CT. However, the sheer workload involved in analyzing a large number of scans drives the need for automated diagnosis methods. Therefore, we propose a detection and classification system for lung nodules in CT scans. Furthermore, we want to observe whether spectral images can increase classifier performance. For the detection of nodules we trained a VGG-like 3D convolutional neural network (CNN). To obtain a primary tumor classifier for our dataset, we pre-trained a 3D CNN with a similar architecture on nodule malignancies of a large publicly available dataset, the LIDC-IDRI dataset. Subsequently, we used this pre-trained network as a feature extractor for the nodules in our dataset. The resulting feature vectors were classified into two (benign/malignant) and three (benign/primary lung cancer/metastases) classes using a support vector machine (SVM). This classification was performed at both the nodule and the scan level. We obtained state-of-the-art performance for detection and malignancy regression on the LIDC-IDRI database. Classification performance on our own dataset was higher for scan-level than for nodule-level predictions. For the three-class scan-level classification we obtained an accuracy of 78%. Spectral features did increase classifier performance, but not significantly. Our work suggests that a pre-trained feature extractor can be used as a primary tumor origin classifier for lung nodules, eliminating the need for elaborate fine-tuning of a new network and large datasets. Code is available at https://github.com/tueimage/lung-nodule-msc-2018.

* MSc thesis Linde Hesse

Risk of Training Diagnostic Algorithms on Data with Demographic Bias

Jun 17, 2020

Samaneh Abbasi-Sureshjani, Ralf Raumanns, Britt E. J. Michels, Gerard Schouten, Veronika Cheplygina

Abstract: One of the critical challenges in machine learning applications is to have fair predictions. There are numerous recent examples in various domains that convincingly show that algorithms trained with biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications, where predictive algorithms are designed mainly based on a limited or given set of medical images, and demographic variables such as age, sex and race are not taken into account. In this work, we conduct a survey of the MICCAI 2018 proceedings to investigate the common practice in medical image analysis applications. Surprisingly, we found that papers focusing on diagnosis rarely describe the demographics of the datasets used, and the diagnosis is purely based on images. In order to highlight the importance of considering the demographics in diagnosis tasks, we used a publicly available dataset of skin lesions. We then demonstrate that a classifier with an overall area under the curve (AUC) of 0.83 has variable performance between 0.76 and 0.91 on subgroups based on age and sex, even though the training set was relatively balanced. Moreover, we show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup, which leads to balanced scores per subgroup. Finally, we discuss the implications of these results and provide recommendations for further research.
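
The gap between an overall AUC and per-subgroup AUCs can be illustrated with a minimal sketch. This is not the authors' code: the helper names `auc` and `auc_by_subgroup`, the toy labels, scores and group tags are all made up for illustration, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Return {group: AUC}, with the AUC computed separately inside each subgroup."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Toy data (hypothetical): label 1 = malignant, scores are classifier outputs.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.1, 0.6, 0.7, 0.4, 0.3]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

overall = auc(labels, scores)
per_group = auc_by_subgroup(labels, scores, groups)
print(overall, per_group)
```

In this toy example the overall AUC hides a large difference between the two groups, which is the kind of effect the abstract reports on real data.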

Predicting Scores of Medical Imaging Segmentation Methods with Meta-Learning

May 08, 2020

Tom van Sonsbeek, Veronika Cheplygina

Abstract: Deep learning has led to state-of-the-art results for many medical imaging tasks, such as segmentation of different anatomical structures. With the increased numbers of deep learning publications and openly available code, choosing a model for a new task becomes more complicated, while time and (computational) resources are limited. A possible solution to choosing a model efficiently is meta-learning, a learning method in which prior performance of a model is used to predict the performance for new tasks. We investigate meta-learning for segmentation across ten datasets of different organs and modalities. We propose four ways to represent each dataset by meta-features: one based on statistical features of the images and three based on deep learning features. We use support vector regression and deep neural networks to learn the relationship between the meta-features and prior model performance. On three external test datasets these methods give Dice scores within 0.10 of the true performance. These results demonstrate the potential of meta-learning in medical imaging.
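
For readers unfamiliar with the metric, the following minimal sketch shows how a Dice score is computed and what "within 0.10 of the true performance" means. It is an illustration only: the masks are toy data, and `predicted_dice` is a hypothetical stand-in for the output of a meta-learner, not a value from the paper.

```python
def dice(mask_a, mask_b):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two flat binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 1D "images": 1 = foreground voxel, 0 = background.
ground_truth = [0, 1, 1, 1, 0, 0]
segmentation = [0, 1, 1, 0, 1, 0]

true_dice = dice(segmentation, ground_truth)

# Hypothetical score predicted from meta-features before running the model.
predicted_dice = 0.60
within_tolerance = abs(predicted_dice - true_dice) <= 0.10
print(round(true_dice, 3), within_tolerance)
```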

Multi-task Learning with Crowdsourced Features Improves Skin Lesion Diagnosis

Apr 28, 2020

Ralf Raumanns, Elif K Contar, Gerard Schouten, Veronika Cheplygina

Abstract: Machine learning has a recognised need for large amounts of annotated data. Due to the high cost of expert annotations, crowdsourcing, where non-experts are asked to label or outline images, has been proposed as an alternative. Although many promising results are reported, the quality of diagnostic crowdsourced labels is still lacking. We propose to address this by instead asking the crowd about visual features of the images, which can be provided more intuitively, and by using these features in a multi-task learning framework. We compare our proposed approach to a baseline model on a set of 2000 skin lesions from the ISIC 2017 challenge dataset. The baseline model only predicts a binary label from the skin lesion image, while our multi-task model also predicts one of the following features: asymmetry of the lesion, border irregularity and color. We show that crowd features in combination with multi-task learning lead to improved generalisation. The area under the receiver operating characteristic curve is 0.754 for the baseline model and 0.782, 0.785 and 0.789 for multi-task models with border, color and asymmetry respectively. Finally, we discuss the findings, identify some limitations and recommend directions for further research.

A Survey of Crowdsourcing in Medical Image Analysis

Feb 25, 2019

Silas Ørting, Andrew Doyle, Matthias Hirth, Arno van Hilten, Oana Inel, Christopher R. Madan, Panagiotis Mavridis, Helen Spiers, Veronika Cheplygina

Abstract: Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to "big-data". However, within the realm of medical image analysis, advances have been curtailed, in part, due to the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with producing large amounts of high-quality meta-data. Recently, there has been growing interest in the application of crowdsourcing for this purpose; a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to provide guidance to researchers considering using crowdsourcing methodologies in their own medical imaging analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing guidance of utility to researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain.

* While this paper is a preprint, we welcome feedback from other researchers, which we will aim to incorporate in the journal version. Interested researchers can submit comments via https://goo.gl/forms/Qzr2yAJQjOnRCAF23

Cats or CAT scans: transfer learning from natural or medical image source datasets?

Oct 12, 2018

Veronika Cheplygina

Abstract: Transfer learning is a widely used strategy in medical image analysis. Instead of only training a network with a limited amount of data from the target task of interest, we can first train the network with other, potentially larger source datasets, creating a more robust model. The source datasets do not have to be related to the target task. For a classification task in lung CT images, we could use either head CT images or images of cats as the source. While head CT images appear more similar to lung CT images, the number and diversity of cat images might lead to a better model overall. In this survey we review a number of papers that have performed similar comparisons. Although the answer to which strategy is best seems to be "it depends", we discuss a number of research directions we need to take as a community, to gain more understanding of this topic.

Automatic Emphysema Detection using Weakly Labeled HRCT Lung Images

Oct 01, 2018

Isabel Pino Peña, Veronika Cheplygina, Sofia Paschaloudi, Morten Vuust, Jesper Carl, Ulla Møller Weinreich, Lasse Riis Østergaard, Marleen de Bruijne

Abstract: We present a method for automatically quantifying emphysema regions using High-Resolution Computed Tomography (HRCT) scans of patients with chronic obstructive pulmonary disease (COPD) that does not require manually annotated scans for training. HRCT scans of controls and of COPD patients with diverse disease severity are acquired at two different centers. Textural features from co-occurrence matrices and Gaussian filter banks are used to characterize the lung parenchyma in the scans. Two robust versions of multiple instance learning (MIL) classifiers, miSVM and MILES, are investigated. The classifiers are trained with the weak labels extracted from the forced expiratory volume in one second (FEV1) and diffusing capacity of the lungs for carbon monoxide (DLCO). At test time, the classifiers output a patient label indicating overall COPD diagnosis and local labels indicating the presence of emphysema. The classifier performance is compared with manual annotations by two radiologists, a classical density-based method, and pulmonary function tests (PFTs). The miSVM classifier performed better than MILES on both patient and emphysema classification. The classifier has a stronger correlation with PFT than the density-based method, the percentage of emphysema in the intersection of annotations from both radiologists, and the percentage of emphysema annotated by one of the radiologists. The correlation between the classifier and the PFT is only outperformed by the second radiologist. The method is therefore promising for facilitating assessment of emphysema and reducing inter-observer variability.

* Accepted at PLoS ONE

Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis

Sep 14, 2018

Veronika Cheplygina, Marleen de Bruijne, Josien P. W. Pluim

Abstract: Machine learning (ML) algorithms have made a tremendous impact in the field of medical imaging. While medical imaging datasets have been growing in size, a frequently mentioned challenge for supervised ML algorithms is the lack of annotated data. As a result, various methods that can learn with less supervision, or with other types of supervision, have been proposed. We review semi-supervised, multiple instance, and transfer learning in medical imaging, in both diagnosis/detection and segmentation tasks. We also discuss connections between these learning scenarios, and opportunities for future research.

* Submitted to Medical Image Analysis

Crowd disagreement about medical images is informative

Aug 17, 2018

Veronika Cheplygina, Josien P. W. Pluim

Abstract: Classifiers for medical image analysis are often trained with a single consensus label, based on combining labels given by experts or crowds. However, disagreement between annotators may be informative, and thus removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma or not, based on crowd annotations of visual characteristics of that lesion. We compare using the mean annotations, illustrating consensus, to standard deviations and other distribution moments, illustrating disagreement. We show that the mean annotations perform best, but that the disagreement measures are still informative. We also make the crowd annotations used in this paper available at https://figshare.com/s/5cbbce14647b66286544.

* Accepted for publication at MICCAI LABELS 2018
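
The distinction the abstract draws between consensus (mean) and disagreement (standard deviation and higher moments) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `annotation_features`, the rating scale and the ratings themselves are hypothetical.

```python
from statistics import mean, pstdev


def annotation_features(ratings):
    """Summarise one lesion's crowd ratings by consensus and disagreement moments."""
    mu = mean(ratings)
    sigma = pstdev(ratings)  # population standard deviation
    # Skewness is the third standardised moment; define it as 0 when all ratings agree.
    skew = 0.0 if sigma == 0 else mean(((r - mu) / sigma) ** 3 for r in ratings)
    return {"mean": mu, "std": sigma, "skew": skew}


# Hypothetical asymmetry ratings from four annotators for two lesions.
lesion_a = [2, 2, 2, 2]   # annotators agree
lesion_b = [0, 4, 0, 4]   # same average, strong disagreement

features_a = annotation_features(lesion_a)
features_b = annotation_features(lesion_b)
print(features_a, features_b)
```

Both lesions have the same mean rating, so a consensus-only feature cannot tell them apart, while the standard deviation can.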

Characterizing multiple instance datasets

Jun 21, 2018

Veronika Cheplygina, David M. J. Tax

Abstract: In many pattern recognition problems, a single feature vector is not sufficient to describe an object. In multiple instance learning (MIL), objects are represented by sets (bags) of feature vectors (instances). This requires an adaptation of standard supervised classifiers in order to train and evaluate on these bags of instances. As in supervised classification, several benchmark datasets and numerous classifiers are available for MIL. When performing a comparison of different MIL classifiers, it is important to understand the differences between the datasets used in the comparison. Seemingly different (based on factors such as dimensionality) datasets may elicit very similar behaviour in classifiers, and vice versa. This has implications for what kind of conclusions may be drawn from the comparison results. We aim to give an overview of the variability of available benchmark datasets and some popular MIL classifiers. We use a dataset dissimilarity measure, based on the differences between the ROC curves obtained by different classifiers, and embed this dataset dissimilarity matrix into a low-dimensional space. Our results show that conceptually similar datasets can behave very differently. We therefore recommend examining such dataset characteristics when making comparisons between existing and new MIL classifiers. The datasets are available via Figshare at https://bit.ly/2K9iTja.

* Published at SIMBAD 2015 workshop
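
The idea of comparing datasets by how classifiers behave on them can be sketched in a simplified form. Note the assumptions: the paper's measure is based on differences between ROC curves, whereas this sketch summarises each curve by its AUC and uses a plain Euclidean distance; the dataset names and AUC values are hypothetical, and the low-dimensional embedding step is omitted.

```python
from math import dist

# Hypothetical AUCs of three classifiers (same order for every dataset).
auc_profiles = {
    "dataset_a": [0.90, 0.85, 0.80],
    "dataset_b": [0.90, 0.85, 0.80],
    "dataset_c": [0.60, 0.85, 0.40],
}

names = sorted(auc_profiles)

# Pairwise dissimilarity: datasets are close when classifiers behave alike on them.
dissimilarity = [
    [dist(auc_profiles[p], auc_profiles[q]) for q in names]
    for p in names
]

for name, row in zip(names, dissimilarity):
    print(name, [round(d, 3) for d in row])
```

The resulting symmetric matrix is the kind of input that an embedding method such as multidimensional scaling would take to place the datasets in a low-dimensional space.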
