Generalizability is the ultimate goal of Machine Learning (ML) image classifiers, for which noise and limited dataset size are among the major concerns. We tackle these challenges through utilizing the framework of deep Multitask Learning (dMTL) and incorporating image depth estimation as an auxiliary task. On a customized and depth-augmented derivation of the MNIST dataset, we show a) multitask loss functions are the most effective approach of implementing dMTL, b) limited dataset size primarily contributes to classification inaccuracy, and c) depth estimation is mostly impacted by noise. In order to further validate the results, we manually labeled the NYU Depth V2 dataset for scene classification tasks. As a contribution to the field, we have made the data in python native format publicly available as an open-source dataset and provided the scene labels. Our experiments on MNIST and NYU-Depth-V2 show dMTL improves generalizability of the classifiers when the dataset is noisy and the number of examples is limited.
Traditional datasets for the radiological diagnosis tend to only provide the radiology image alongside the radiology report. However, radiology reading as performed by radiologists is a complex process, and information such as the radiologist's eye-fixations over the course of the reading has the potential to be an invaluable data source to learn from. Nonetheless, the collection of such data is expensive and time-consuming. This leads to the question of whether such data is worth the investment to collect. This paper utilizes the recently published Eye-Gaze dataset to perform an exhaustive study on the impact on performance and explainability of deep learning (DL) classification in the face of varying levels of input features, namely: radiology images, radiology report text, and radiologist eye-gaze data. We find that the best classification performance of X-ray images is achieved with a combination of radiology report free-text and radiology image, with the eye-gaze data providing no performance boost. Nonetheless, eye-gaze data serving as secondary ground truth alongside the class label results in highly explainable models that generate better attention maps compared to models trained to do classification and attention map generation without eye-gaze data.
The application of artificial intelligence (AI) techniques to medical imaging data has yielded promising results. As an important branch of AI pipelines in medical imaging, radiomics faces two major challenges namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets, and a comprehensive radiomics pipeline that investigates the effects of radiomics feature extraction settings such as binWidth and image normalization on the reproducibility of the radiomics results performance. To make radiomics research more accessible and reproducible, we provide guidelines for building machine learning (ML) models on radiomics data, introduce Open-radiomics, an evolving collection of open-source radiomics datasets, and publish baseline models for the datasets.
As deep learning is widely used in the radiology field, the explainability of such models is increasingly becoming essential to gain clinicians' trust when using the models for diagnosis. In this research, three experiment sets were conducted with a U-Net architecture to improve the classification performance while enhancing the heatmaps corresponding to the model's focus through incorporating heatmap generators during training. All of the experiments used the dataset that contained chest radiographs, associated labels from one of the three conditions ("normal", "congestive heart failure (CHF)", and "pneumonia"), and numerical information regarding a radiologist's eye-gaze coordinates on the images. The paper (A. Karargyris and Moradi, 2021) that introduced this dataset developed a U-Net model, which was treated as the baseline model for this research, to show how the eye-gaze data can be used in multi-modal training for explainability improvement. To compare the classification performances, the 95% confidence intervals (CI) of the area under the receiver operating characteristic curve (AUC) were measured. The best method achieved an AUC of 0.913 (CI: 0.860-0.966). The greatest improvements were for the "pneumonia" and "CHF" classes, which the baseline model struggled most to classify, resulting in AUCs of 0.859 (CI: 0.732-0.957) and 0.962 (CI: 0.933-0.989), respectively. The proposed method's decoder was also able to produce probability masks that highlight the determining image parts in model classifications, similarly as the radiologist's eye-gaze data. Hence, this work showed that incorporating heatmap generators and eye-gaze information into training can simultaneously improve disease classification and provide explainable visuals that align well with how the radiologist viewed the chest radiographs when making diagnosis.
The COVID-19 pandemic has affected lives of people from different countries for almost two years. The changes on lifestyles due to the pandemic may cause psychosocial stressors for individuals, and have a potential to lead to mental health problems. To provide high quality mental health supports, healthcare organization need to identify the COVID-19 specific stressors, and notice the trends of prevalence of those stressors. This study aims to apply natural language processing (NLP) on social media data to identify the psychosocial stressors during COVID-19 pandemic, and to analyze the trend on prevalence of stressors at different stages of the pandemic. We obtained dataset of 9266 Reddit posts from subreddit \rCOVID19_support, from 14th Feb ,2020 to 19th July 2021. We used Latent Dirichlet Allocation (LDA) topic model and lexicon methods to identify the topics that were mentioned on the subreddit. Our result presented a dashboard to visualize the trend of prevalence of topics about covid-19 related stressors being discussed on social media platform. The result could provide insights about the prevalence of pandemic related stressors during different stages of COVID-19. The NLP techniques leveraged in this study could also be applied to analyze event specific stressors in the future.
Brain tumor segmentation is a critical task for tumor volumetric analyses and AI algorithms. However, it is a time-consuming process and requires neuroradiology expertise. While there has been extensive research focused on optimizing brain tumor segmentation in the adult population, studies on AI guided pediatric tumor segmentation are scarce. Furthermore, MRI signal characteristics of pediatric and adult brain tumors differ, necessitating the development of segmentation algorithms specifically designed for pediatric brain tumors. We developed a segmentation model trained on magnetic resonance imaging (MRI) of pediatric patients with low-grade gliomas (pLGGs) from The Hospital for Sick Children (Toronto, Ontario, Canada). The proposed model utilizes deep Multitask Learning (dMTL) by adding tumor's genetic alteration classifier as an auxiliary task to the main network, ultimately improving the accuracy of the segmentation results.
Deep convolutional neural networks (CNNs) have become an essential tool in the medical imaging-based computer-aided diagnostic pipeline. However, training accurate and reliable CNNs requires large fine-grain annotated datasets. To alleviate this, weakly-supervised methods can be used to obtain local information from global labels. This work proposes the use of localized perturbations as a weakly-supervised solution to extract segmentation masks of brain tumours from a pretrained 3D classification model. Furthermore, we propose a novel optimal perturbation method that exploits 3D superpixels to find the most relevant area for a given classification using a U-net architecture. Our method achieved a Dice similarity coefficient (DSC) of 0.44 when compared with expert annotations. When compared against Grad-CAM, our method outperformed both in visualization and localization ability of the tumour region, with Grad-CAM only achieving 0.11 average DSC.
Generative Adversarial Networks can learn the mapping of random noise to realistic images in a semi-supervised framework. This mapping ability can be used for semi-supervised image classification to detect images of an unknown class where there is no training data to be used for supervised classification. However, if the unknown class shares similar characteristics to the known class(es), GANs can learn to generalize and generate images that look like both classes. This generalization ability can hinder the classification performance. In this work, we propose the Vanishing Twin GAN. By training a weak GAN and using its generated output image parallel to the regular GAN, the Vanishing Twin training improves semi-supervised image classification where image similarity can hurt classification tasks.
From generating never-before-seen images to domain adaptation, applications of Generative Adversarial Networks (GANs) spread wide in the domain of vision and graphics problems. With the remarkable ability of GANs in learning the distribution and generating images of a particular class, they can be used for semi-supervised classification tasks. However, the problem is that if two classes of images share similar characteristics, the GAN might learn to generalize and hinder the classification of the two classes. In this paper, we use various images from MNIST and Fashion-MNIST datasets to illustrate how similar images cause the GAN to generalize, leading to the poor classification of images. We propose a modification to the traditional training of GANs that allows for improved multi-class classification in similar classes of images in a semi-supervised learning framework.