Abstract: Diabetic foot ulcer (DFU) detection is a clinically significant yet challenging task due to the scarcity and variability of publicly available datasets. To address these problems, we propose ConMatFormer, a new hybrid deep learning architecture that combines ConvNeXt blocks, multiple attention mechanisms, namely the convolutional block attention module (CBAM) and the dual attention network (DANet), and transformer modules so that they complement one another. This design improves both local feature extraction and the understanding of global context, which allows the model to capture subtle skin patterns across different types of DFU accurately. To address class imbalance, we used data augmentation methods. A ConvNeXt block was used to obtain detailed local features in the initial stages. Subsequently, we added a transformer module to strengthen the modeling of long-range dependencies. This enabled the model to identify underrepresented (minority) DFU classes. Tests on the DS1 (DFUC2021) and DS2 (diabetic foot ulcer (DFU)) datasets showed that ConMatFormer outperformed state-of-the-art (SOTA) convolutional neural network (CNN) and Vision Transformer (ViT) models in terms of accuracy, reliability, and flexibility. The proposed method achieved an accuracy of 0.8961 and a precision of 0.9160 in a single experiment, which is a significant improvement over the current standards for classifying DFUs. In addition, under 4-fold cross-validation, the proposed model achieved an accuracy of 0.9755 with a standard deviation of only 0.0031. We further applied explainable artificial intelligence (XAI) methods, such as Grad-CAM, Grad-CAM++, and LIME, to examine the transparency and trustworthiness of the decision-making process. Our findings set a new benchmark for DFU classification and provide a hybrid attention transformer framework for medical image analysis.
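
The cross-validation figure above is reported as a mean accuracy with a standard deviation over four folds. A minimal sketch of how such a summary might be produced is given below. It rests on several assumptions that the abstract does not confirm: the folds are stratified by class, the population standard deviation is used, and the helper names `stratified_folds` and `summarize` as well as the toy labels and scores are hypothetical and are not the paper's data.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def stratified_folds(labels, k=4, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def summarize(fold_scores):
    """Return (mean, population standard deviation) of per-fold scores."""
    return mean(fold_scores), pstdev(fold_scores)


# Toy imbalanced labels (hypothetical, not the DFUC2021 class distribution).
labels = ["class_a"] * 8 + ["class_b"] * 4 + ["class_c"] * 4
folds = stratified_folds(labels, k=4)

# Illustrative per-fold accuracies (made up, not the paper's per-fold results).
avg, std = summarize([0.9, 1.0, 0.9, 1.0])
print(f"accuracy = {avg:.4f} +/- {std:.4f}")
```

Each fold would serve once as the validation set while the remaining three are used for training. Whether the sample or population standard deviation is reported changes the number slightly, so it is worth stating which one is used.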




Abstract: Research and development of OCR systems for low-resource languages is relatively new. Low-resource languages have little data available for training machine translation or other systems. Even though a vast amount of text has been digitized and made available on the internet, the text is still in PDF and image formats, which are not readily accessible. This paper discusses text recognition for two scripts, Bengali and Nepali; there are about 300 million Bengali speakers and 40 million Nepali speakers. In this study, a model based on encoder-decoder transformers was developed, and its efficacy was assessed on a collection of optical text images, both handwritten and printed. The results indicate that the proposed technique is comparable to current approaches and achieves high precision in recognizing Bengali and Nepali text. This study can pave the way for the advanced and accessible study of linguistics in South East Asia.
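
The abstract does not name the metric used to assess recognition quality. One common choice for OCR is the character error rate (CER), the edit distance between the recognized text and the reference divided by the reference length. The sketch below is an assumption about how such a score could be computed, not the paper's evaluation code; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: each insertion, deletion, or substitution costs 1."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference string."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(character_error_rate("abcd", "abxd"))
```

This version compares Unicode code points. Bengali and Devanagari text uses combining vowel signs, so a stricter evaluation might segment strings into grapheme clusters before comparing them.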




Abstract: Accurate disease categorization using endoscopic images is a significant problem in gastroenterology. This paper describes a technique for assisting medical diagnosis and identifying gastrointestinal tract disorders by classifying features extracted from endoscopic images using a vision transformer and a transfer learning model. Vision transformers have shown very promising results on difficult image classification tasks. In this paper, we propose a vision transformer based approach to detect gastrointestinal diseases from curated wireless capsule endoscopy (WCE) images of the colon, with an accuracy of 95.63%. We have compared this transformer based approach with the pretrained convolutional neural network (CNN) model DenseNet201 and demonstrated that the vision transformer surpassed DenseNet201 on various quantitative performance evaluation metrics.
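
The abstract does not list which quantitative metrics were used for the comparison. A minimal sketch of one plausible setup is shown below: accuracy plus macro-averaged precision, recall, and F1 computed for two models on the same test labels. The function `macro_scores`, the variable names, and the toy predictions are hypothetical and do not reproduce the paper's results.

```python
def macro_scores(y_true, y_pred):
    """Accuracy and macro-averaged precision, recall, F1 for multi-class labels."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels and predictions (hypothetical, for illustration only).
y_true = [0, 0, 1, 1]
model_a_pred = [0, 1, 1, 1]   # stands in for one model's test predictions
model_b_pred = [0, 0, 1, 1]   # stands in for the other model's predictions
for name, pred in [("model_a", model_a_pred), ("model_b", model_b_pred)]:
    print(name, macro_scores(y_true, pred))
```

Macro averaging weights every class equally, which matters when some disease categories have far fewer images than others.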




Abstract: Polycystic Ovary Syndrome (PCOS) is an endocrinological dysfunction prevalent among women of reproductive age. PCOS is a combination of symptoms caused by an excess of androgens, a group of sex hormones, in women. It causes symptoms including acne, alopecia, hirsutism, hyperandrogenaemia, and oligo-ovulation, among others. It is also a major cause of female infertility. An estimated 15% of reproductive-aged women are affected by PCOS globally. Given the severity of its deleterious effects, the necessity of detecting PCOS early cannot be overstated. In this paper, we have developed PCONet, a Convolutional Neural Network (CNN), to detect polycystic ovaries from ovarian ultrasound images. We have also fine-tuned InceptionV3, a pretrained convolutional neural network of 45 layers, using transfer learning to classify polycystic ovarian ultrasound images. We have compared these two models on various quantitative performance evaluation metrics and demonstrated that PCONet is the superior of the two, with an accuracy of 98.12%, whereas the fine-tuned InceptionV3 achieved an accuracy of 96.56% on test images.
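
For a binary detection task such as this one, accuracy is often reported alongside sensitivity and specificity. The abstract does not say which of these were computed, so the sketch below is only an assumed illustration. The function `binary_report`, the label encoding (1 for polycystic, 0 for normal), and the toy data are hypothetical.

```python
def binary_report(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of truly positive cases that the model flags.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Share of truly negative cases that the model clears.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy labels (hypothetical): 1 = polycystic, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_report(y_true, y_pred))
```

In a screening setting, a missed positive case and a false alarm carry different costs, so two models with similar accuracy can still differ in practical value.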