The emerging area of computational pathology (CPath) is ripe ground for the application of deep learning (DL) methods to healthcare due to the sheer volume of raw pixel data in whole-slide images (WSIs) of cancerous tissue slides. However, it is imperative for DL algorithms relying on nuclei-level details to be able to cope with data from `the clinical wild', which tends to be quite challenging. We study and extend the recently released PanNuke dataset, consisting of $\sim$200,000 nuclei categorized into 5 clinically important classes, for the challenging tasks of segmenting and classifying nuclei in WSIs \cite{gamper_pannuke:_2019}. Previous pan-cancer datasets consisted of only up to 9 different tissues and up to 21,000 unlabeled nuclei \cite{kumar2019multi} and just over 24,000 labeled nuclei with segmentation masks \cite{graham2019hover}. PanNuke consists of 19 different tissue types that have been semi-automatically annotated and quality controlled by clinical pathologists, leading to a dataset with statistics similar to `the clinical wild' and with minimal selection bias. We study the performance of segmentation and classification models when applied to the proposed dataset and demonstrate the application of models trained on PanNuke to whole-slide images. We provide comprehensive statistics about the dataset and outline recommendations and research directions to address the limitations of existing DL tools when applied to real-world CPath applications.
To train a robust deep learning model, one usually needs a balanced set of categories in the training data. Data acquired in the medical domain, however, frequently contains an abundance of healthy patients versus a small variety of positive, abnormal cases. Moreover, the annotation of a positive sample requires time-consuming input from medical domain experts. This scenario suggests the promise of one-class classification approaches. In this work, we propose a general one-class classification model for histology that is meta-trained on multiple histology datasets simultaneously and can be applied to new tasks without expensive re-training. This model could be easily used by pathology domain experts and could potentially be used for screening purposes.
The debate on AI ethics largely focuses on technical improvements and stronger regulation to prevent accidents or misuse of AI, with solutions relying on holding individual actors accountable for responsible AI development. We argue that this "agency" approach, while useful and necessary, disregards more indirect and complex risks resulting from AI's interaction with the socio-economic and political context. This paper calls for a "structural" approach to assessing AI's effects in order to understand and prevent such systemic risks where no individual can be held accountable for the broader negative impacts. This is particularly relevant for AI applied to systemic issues such as climate change and food security, which require political solutions and global cooperation. To properly address the wide range of AI risks and ensure 'AI for social good', agency-focused policies must be complemented by policies informed by a structural approach.
The potential of using remote sensing imagery for environmental modelling and for providing real-time support to humanitarian operations such as hurricane relief efforts is well established. These applications are substantially affected by missing data due to non-structural noise such as clouds, shadows and other atmospheric effects. In this work, we probe the potential of applying a cycle-consistent latent variable deep generative model (DGM) for denoising cloudy Sentinel-2 observations conditioned on the information in cloud-penetrating bands. We adapt the recently proposed Fr\'{e}chet Distance metric to remote sensing images for evaluating the performance of the generator, demonstrate the potential of DGMs for conditional denoising, and discuss future directions as well as the limitations of DGMs in Earth science and humanitarian applications.