University of Arkansas, USA
Abstract:Medical image classifiers detect gastrointestinal diseases well, but they do not explain their decisions. Large language models can generate clinical text, yet they struggle with visual reasoning and often produce unstable or incorrect explanations. This leaves a gap between what a model sees and the type of reasoning a clinician expects. We introduce a framework that links image classification with structured clinical reasoning. A new hybrid model, MobileCoAtNet, is designed for endoscopic images and achieves high accuracy across eight stomach-related classes. Its outputs are then used to drive reasoning by several LLMs. To judge this reasoning, we build two expert-verified benchmarks covering causes, symptoms, treatment, lifestyle, and follow-up care. Thirty-two LLMs are evaluated against these gold standards. Strong classification improves the quality of their explanations, but none of the models reach human-level stability. Even the best LLMs change their reasoning when prompts vary. Our study shows that combining DL with LLMs can produce useful clinical narratives, but current LLMs remain unreliable for high-stakes medical decisions. The framework provides a clearer view of their limits and a path for building safer reasoning systems. The complete source code and datasets used in this study are available at https://github.com/souravbasakshuvo/DL3M.




Abstract:In recent years, the integration of deep learning techniques into medical imaging has revolutionized the diagnosis and treatment of lung diseases, particularly in the context of COVID-19 and pneumonia. This paper presents a novel, lightweight deep learning based segmentation-classification network designed to enhance the detection and localization of lung infections using chest X-ray images. By leveraging the power of transfer learning with pre-trained VGG-16 weights, our model achieves robust performance even with limited training data. The architecture incorporates refined skip connections within the UNet++ framework, reducing semantic gaps and improving precision in segmentation tasks. Additionally, a classification module is integrated at the end of the encoder block, enabling simultaneous classification and segmentation. This dual functionality enhances the model's versatility, providing comprehensive diagnostic insights while optimizing computational efficiency. Experimental results demonstrate that our proposed lightweight network outperforms existing methods in terms of accuracy and computational requirements, making it a viable solution for real-time and resource constrained medical imaging applications. Furthermore, the streamlined design facilitates easier hyperparameter tuning and deployment on edge devices. This work underscores the potential of advanced deep learning architectures in improving clinical outcomes through precise and efficient medical image analysis. Our model achieved remarkable results with an Intersection over Union (IoU) of 93.59% and a Dice Similarity Coefficient (DSC) of 97.61% in lung area segmentation, and an IoU of 97.67% and a DSC of 87.61% for infection region localization. Additionally, it demonstrated high accuracy of 93.86% and sensitivity of 89.55% in detecting chest diseases, highlighting its efficacy and reliability.