Segmentation of Multiple Sclerosis (MS) lesions in longitudinal brain MR scans is performed for monitoring the progression of MS lesions. In order to improve segmentation, we use spatio-temporal cues in longitudinal data. To that end, we propose two approaches: Our longitudinal segmentation architecture which is grounded upon early-fusion of longitudinal data. And complementary to the longitudinal architecture, we propose a novel multi-task learning approach by defining an auxiliary self-supervised task of deformable registration between two time-points to guide the neural network toward learning from spatio-temporal changes. We show the effectiveness of our methods on two datasets: An in-house dataset comprised of 70 patients with one follow-up study for each patient and the ISBI longitudinal MS lesion segmentation challenge dataset which has 19 patients with three to five follow-up studies. Our results show that spatio-temporal information in longitudinal data is a beneficial cue for improving segmentation. Code is publicly available.
In this paper we introduce the first reinforcement learning (RL) based robotic navigation method which utilizes ultrasound (US) images as an input. Our approach combines state-of-the-art RL techniques, specifically deep Q-networks (DQN) with memory buffers and a binary classifier for deciding when to terminate the task. Our method is trained and evaluated on an in-house collected data-set of 34 volunteers and when compared to pure RL and supervised learning (SL) techniques, it performs substantially better, which highlights the suitability of RL navigation for US-guided procedures. When testing our proposed model, we obtained a 82.91% chance of navigating correctly to the sacrum from 165 different starting positions on 5 different unseen simulated environments.
Recent advances in deep learning have resulted in great successes in various applications. Although semi-supervised or unsupervised learning methods have been widely investigated, the performance of deep neural networks highly depends on the annotated data. The problem is that the budget for annotation is usually limited due to the annotation time and expensive annotation cost in medical data. Active learning is one of the solutions to this problem where an active learner is designed to indicate which samples need to be annotated to effectively train a target model. In this paper, we propose a novel active learning method, confident coreset, which considers both uncertainty and distribution for effectively selecting informative samples. By comparative experiments on two medical image analysis tasks, we show that our method outperforms other active learning methods.
Current deep-learning based methods do not easily integrate to clinical protocols, neither take full advantage of medical knowledge. In this work, we propose and compare several strategies relying on curriculum learning, to support the classification of proximal femur fracture from X-ray images, a challenging problem as reflected by existing intra- and inter-expert disagreement. Our strategies are derived from knowledge such as medical decision trees and inconsistencies in the annotations of multiple experts, which allows us to assign a degree of difficulty to each training sample. We demonstrate that if we start learning "easy" examples and move towards "hard", the model can reach a better performance, even with fewer data. The evaluation is performed on the classification of a clinical dataset of about 1000 X-ray images. Our results show that, compared to class-uniform and random strategies, the proposed medical knowledge-based curriculum, performs up to 15% better in terms of accuracy, achieving the performance of experienced trauma surgeons.
Computer-aided diagnosis (CADx) algorithms in medicine provide patient-specific decision support for physicians. These algorithms are usually applied after full acquisition of high-dimensional multimodal examination data, and often assume feature-completeness. This, however, is rarely the case due to examination costs, invasiveness, or a lack of indication. A sub-problem in CADx, which to our knowledge has received very little attention among the CADx community so far, is to guide the physician during the entire peri-diagnostic workflow, including the acquisition stage. We model the following question, asked from a physician's perspective: ''Given the evidence collected so far, which examination should I perform next, in order to achieve the most accurate and efficient diagnostic prediction?''. In this work, we propose a novel approach which is enticingly simple: use dropout at the input layer, and integrated gradients of the trained network at test-time to attribute feature importance dynamically. We validate and explain the effectiveness of our proposed approach using two public medical and two synthetic datasets. Results show that our proposed approach is more cost- and feature-efficient than prior approaches and achieves a higher overall accuracy. This directly translates to less unnecessary examinations for patients, and a quicker, less costly and more accurate decision support for the physician.
Recently, Graph Convolutional Networks (GCNs) has proven to be a powerful machine learning tool for Computer Aided Diagnosis (CADx) and disease prediction. A key component in these models is to build a population graph, where the graph adjacency matrix represents pair-wise patient similarities. Until now, the similarity metrics have been defined manually, usually based on meta-features like demographics or clinical scores. The definition of the metric, however, needs careful tuning, as GCNs are very sensitive to the graph structure. In this paper, we demonstrate for the first time in the CADx domain that it is possible to learn a single, optimal graph towards the GCN's downstream task of disease classification. To this end, we propose a novel, end-to-end trainable graph learning architecture for dynamic and localized graph pruning. Unlike commonly employed spectral GCN approaches, our GCN is spatial and inductive, and can thus infer previously unseen patients as well. We demonstrate significant classification improvements with our learned graph on two CADx problems in medicine. We further explain and visualize this result using an artificial dataset, underlining the importance of graph learning for more accurate and robust inference with GCNs in medical applications.
Automatic surgical phase recognition is a challenging and crucial task with the potential to improve patient safety and become an integral part of intra-operative decision-support systems. In this paper, we propose, for the first time in workflow analysis, a Multi-Stage Temporal Convolutional Network (MS-TCN) that performs hierarchical prediction refinement for surgical phase recognition. Causal, dilated convolutions allow for a large receptive field and online inference with smooth predictions even during ambiguous transitions. Our method is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos with and without the use of additional surgical tool information. Outperforming various state-of-the-art LSTM approaches, we verify the suitability of the proposed causal MS-TCN for surgical phase recognition.
Learning discriminative powerful representations is a crucial step for machine learning systems. Introducing invariance against arbitrary nuisance or sensitive attributes while performing well on specific tasks is an important problem in representation learning. This is mostly approached by purging the sensitive information from learned representations. In this paper, we propose a novel disentanglement approach to invariant representation problem. We disentangle the meaningful and sensitive representations by enforcing orthogonality constraints as a proxy for independence. We explicitly enforce the meaningful representation to be agnostic to sensitive information by entropy maximization. The proposed approach is evaluated on five publicly available datasets and compared with state of the art methods for learning fairness and invariance achieving the state of the art performance on three datasets and comparable performance on the rest. Further, we perform an ablative study to evaluate the effect of each component.
Medical image segmentation is one of the major challenges addressed by machine learning methods. Yet, deep learning methods profoundly depend on a huge amount of annotated data which is time-consuming and costly. Though semi-supervised learning methods approach this problem by leveraging an abundant amount of unlabeled data along with a small amount of labeled data in the training process. Recently, MixUp regularizer [32] has been successfully introduced to semi-supervised learning methods showing superior performance [3]. MixUp augments the model with new data points through linear interpolation of the data at the input space. In this paper, we argue that this option is limited, instead, we propose ROAM, a random layer mixup, which encourages the network to be less confident for interpolated data points at randomly selected space. Hence, avoids over-fitting and enhances the generalization ability. We validate our method on publicly available datasets on whole-brain image segmentation (MALC) achieving state-of-the-art results in fully supervised (89.8%) and semi-supervised (87.2%) settings with relative improvement up to 2.75% and 16.73%, respectively.