Precision medicine tailored to individual patients has gained significant attention in recent times. Machine learning techniques are now employed to process personalized data from various sources, including images, genetics, and assessments. These techniques have demonstrated good outcomes in many clinical prediction tasks. Notably, the approach of constructing graphs by linking similar patients and then applying graph neural networks (GNNs) stands out, because related information from analogous patients are aggregated and considered for prediction. However, selecting the appropriate edge feature to define patient similarity and construct the graph is challenging, given that each patient is depicted by high-dimensional features from diverse sources. Previous studies rely on human expertise to select the edge feature, which is neither scalable nor efficient in pinpointing crucial edge features for complex diseases. In this paper, we propose a novel algorithm named \ours, which can automatically select important features to construct multiple patient similarity graphs, and train GNNs based on these graphs as weak learners in adaptive boosting. \ours{} is evaluated on two real-world medical scenarios and shows superiors performance.
Breast cancer has reached the highest incidence rate worldwide among all malignancies since 2020. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcome of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex context of breast imaging modalities. Considering the rapid improvement in the deep learning technology and the increasing severity of breast cancer, it is critical to summarize past progress and identify future challenges to be addressed. In this paper, we provide an extensive survey of deep learning-based breast cancer imaging research, covering studies on mammogram, ultrasound, magnetic resonance imaging, and digital pathology images over the past decade. The major deep learning methods, publicly available datasets, and applications on imaging-based screening, diagnosis, treatment response prediction, and prognosis are described in detail. Drawn from the findings of this survey, we present a comprehensive discussion of the challenges and potential avenues for future research in deep learning-based breast cancer imaging.
As deep neural networks include a high number of parameters and operations, it can be a challenge to implement these models on devices with limited computational resources. Despite the development of novel pruning methods toward resource-efficient models, it has become evident that these models are not capable of handling "imbalanced" and "limited number of data points". With input and output information, along with the values of the filters, a novel filter pruning method is proposed. Our pruning method considers the fact that all information about the importance of a filter may not be reflected in the value of the filter. Instead, it is reflected in the changes made to the data after the filter is applied to it. In this work, three methods are compared with the same training conditions except for the ranking of each method. We demonstrated that our model performed significantly better than other methods for medical datasets which are inherently imbalanced. When we removed up to 58% of FLOPs for the IDRID dataset and up to 45% for the ISIC dataset, our model was able to yield an equivalent (or even superior) result to the baseline model while other models were unable to achieve similar results. To evaluate FLOP and parameter reduction using our model in real-world settings, we built a smartphone app, where we demonstrated a reduction of up to 79% in memory usage and 72% in prediction time. All codes and parameters for training different models are available at https://github.com/mohofar/Beta-Rank
Learning medical visual representations directly from paired radiology reports has become an emerging topic in representation learning. However, existing medical image-text joint learning methods are limited by instance or local supervision analysis, ignoring disease-level semantic correspondences. In this paper, we present a novel Multi-Granularity Cross-modal Alignment (MGCA) framework for generalized medical visual representation learning by harnessing the naturally exhibited semantic correspondences between medical image and radiology reports at three different levels, i.e., pathological region-level, instance-level, and disease-level. Specifically, we first incorporate the instance-wise alignment module by maximizing the agreement between image-report pairs. Further, for token-wise alignment, we introduce a bidirectional cross-attention strategy to explicitly learn the matching between fine-grained visual tokens and text tokens, followed by contrastive learning to align them. More important, to leverage the high-level inter-subject relationship semantic (e.g., disease) correspondences, we design a novel cross-modal disease-level alignment paradigm to enforce the cross-modal cluster assignment consistency. Extensive experimental results on seven downstream medical image datasets covering image classification, object detection, and semantic segmentation tasks demonstrate the stable and superior performance of our framework.
Deep learning has demonstrated radiograph screening performances that are comparable or superior to radiologists. However, recent studies show that deep models for thoracic disease classification usually show degraded performance when applied to external data. Such phenomena can be categorized into shortcut learning, where the deep models learn unintended decision rules that can fit the identically distributed training and test set but fail to generalize to other distributions. A natural way to alleviate this defect is explicitly indicating the lesions and focusing the model on learning the intended features. In this paper, we conduct extensive retrospective experiments to compare a popular thoracic disease classification model, CheXNet, and a thoracic lesion detection model, CheXDet. We first showed that the two models achieved similar image-level classification performance on the internal test set with no significant differences under many scenarios. Meanwhile, we found incorporating external training data even led to performance degradation for CheXNet. Then, we compared the models' internal performance on the lesion localization task and showed that CheXDet achieved significantly better performance than CheXNet even when given 80% less training data. By further visualizing the models' decision-making regions, we revealed that CheXNet learned patterns other than the target lesions, demonstrating its shortcut learning defect. Moreover, CheXDet achieved significantly better external performance than CheXNet on both the image-level classification task and the lesion localization task. Our findings suggest improving annotation granularity for training deep learning systems as a promising way to elevate future deep learning-based diagnosis systems for clinical usage.
While the field of deep tomographic reconstruction has been advancing rapidly since 2016, there are constant debates and major challenges with the recently published PNAS paper on instabilities of deep learning in image reconstruction as a primary example, in which three kinds of unstable phenomena are demonstrated: (1) tiny perturbation on input generating strong output artifacts, (2) small structural features going undetected, and (3) increased input data leading to decreased performance. In this article, we show that key algorithmic ingredients of analytic inversion, compressed sensing, iterative reconstruction, and deep learning can be synergized to stabilize deep neural networks for optimal tomographic image reconstruction. With the same or similar datasets used in the PNAS paper and relative to the same state of the art compressed sensing algorithm, our proposed analytic, compressed, iterative deep (ACID) network produces superior imaging performance that are both accurate and robust with respect to noise, under adversarial attack, and as the number of input data is increased. We believe that deep tomographic reconstruction networks can be designed to produce accurate and robust results, improve clinical and other important applications, and eventually dominate the tomographic imaging field.