



Abstract:Test time adaptation (TTA) equips deep learning models to handle unseen test data that deviates from the training distribution, even when source data is inaccessible. While traditional TTA methods often rely on entropy as a confidence metric, its effectiveness can be limited, particularly in biased scenarios. Extending existing approaches like the Pseudo Label Probability Difference (PLPD), we introduce ETAGE, a refined TTA method that integrates entropy minimization with gradient norms and PLPD, to enhance sample selection and adaptation. Our method prioritizes samples that are less likely to cause instability by combining high entropy with high gradient norms out of adaptation, thus avoiding the overfitting to noise often observed in previous methods. Extensive experiments on CIFAR-10-C and CIFAR-100-C datasets demonstrate that our approach outperforms existing TTA techniques, particularly in challenging and biased scenarios, leading to more robust and consistent model performance across diverse test scenarios. The codebase for ETAGE is available on https://github.com/afsharshamsi/ETAGE.




Abstract:Polyp segmentation plays a crucial role in the early detection and diagnosis of colorectal cancer. However, obtaining accurate segmentations often requires labor-intensive annotations and specialized models. Recently, Meta AI Research released a general Segment Anything Model 2 (SAM 2), which has demonstrated promising performance in several segmentation tasks. In this work, we evaluate the performance of SAM 2 in segmenting polyps under various prompted settings. We hope this report will provide insights to advance the field of polyp segmentation and promote more interesting work in the future. This project is publicly available at https://github.com/ sajjad-sh33/Polyp-SAM-2.




Abstract:Computational complexity of Bayesian learning is impeding its adoption in practical, large-scale tasks. Despite demonstrations of significant merits such as improved robustness and resilience to unseen or out-of-distribution inputs over their non- Bayesian counterparts, their practical use has faded to near insignificance. In this study, we introduce an innovative framework to mitigate the computational burden of Bayesian neural networks (BNNs). Our approach follows the principle of Bayesian techniques based on deep ensembles, but significantly reduces their cost via multiple low-rank perturbations of parameters arising from a pre-trained neural network. Both vanilla version of ensembles as well as more sophisticated schemes such as Bayesian learning with Stein Variational Gradient Descent (SVGD), previously deemed impractical for large models, can be seamlessly implemented within the proposed framework, called Bayesian Low-Rank LeArning (Bella). In a nutshell, i) Bella achieves a dramatic reduction in the number of trainable parameters required to approximate a Bayesian posterior; and ii) it not only maintains, but in some instances, surpasses the performance of conventional Bayesian learning methods and non-Bayesian baselines. Our results with large-scale tasks such as ImageNet, CAMELYON17, DomainNet, VQA with CLIP, LLaVA demonstrate the effectiveness and versatility of Bella in building highly scalable and practical Bayesian deep models for real-world applications.




Abstract:Recent statistics indicate that approximately 1.3 billion individuals worldwide suffer from hypertension, a leading cause of premature death globally. Blood pressure (BP) serves as a critical health indicator for accurate and timely diagnosis and/or treatment of hypertension. Driven by recent advancements in Artificial Intelligence (AI) and Deep Neural Networks (DNNs), there has been a surge of interest in developing data-driven and cuff-less BP estimation solutions. In this context, current literature predominantly focuses on coupling Electrocardiography (ECG) and Photoplethysmography (PPG) sensors, though this approach is constrained by reliance on multiple sensor types. An alternative, utilizing standalone PPG signals, presents challenges due to the absence of auxiliary sensors (ECG), requiring the use of morphological features while addressing motion artifacts and high-frequency noise. To address these issues, the paper introduces the TransfoRhythm framework, a Transformer-based DNN architecture built upon the recently released physiological database, MIMIC-IV. Leveraging Multi-Head Attention (MHA) mechanism, TransfoRhythm identifies dependencies and similarities across data segments, forming a robust framework for cuff-less BP estimation solely using PPG signals. To our knowledge, this paper represents the first study to apply the MIMIC IV dataset for cuff-less BP estimation, and TransfoRhythm is the first MHA-based model trained via MIMIC IV for BP prediction. Performance evaluation through comprehensive experiments demonstrates TransfoRhythm's superiority over its state-of-the-art counterparts. Specifically, TransfoRhythm achieves highly accurate results with Root Mean Square Error (RMSE) of [1.84, 1.42] and Mean Absolute Error (MAE) of [1.50, 1.17] for systolic and diastolic blood pressures, respectively.




Abstract:Federated Learning (FL) has emerged as a prominent alternative to the traditional centralized learning approach. Generally speaking, FL is a decentralized approach that allows for collaborative training of Machine Learning (ML) models across multiple local nodes, ensuring data privacy and security while leveraging diverse datasets. Conventional FL, however, is susceptible to gradient inversion attacks, restrictively enforces a uniform architecture on local models, and suffers from model heterogeneity (model drift) due to non-IID local datasets. To mitigate some of these challenges, the new paradigm of Federated Knowledge Distillation (FKD) has emerged. FDK is developed based on the concept of Knowledge Distillation (KD), which involves extraction and transfer of a large and well-trained teacher model's knowledge to lightweight student models. FKD, however, still faces the model drift issue. Intuitively speaking, not all knowledge is universally beneficial due to the inherent diversity of data among local nodes. This calls for innovative mechanisms to evaluate the relevance and effectiveness of each client's knowledge for others, to prevent propagation of adverse knowledge. In this context, the paper proposes Effective Knowledge Fusion (KnFu) algorithm that evaluates knowledge of local models to only fuse semantic neighbors' effective knowledge for each client. The KnFu is a personalized effective knowledge fusion scheme for each client, that analyzes effectiveness of different local models' knowledge prior to the aggregation phase. Comprehensive experiments were performed on MNIST and CIFAR10 datasets illustrating effectiveness of the proposed KnFu in comparison to its state-of-the-art counterparts. A key conclusion of the work is that in scenarios with large and highly heterogeneous local datasets, local training could be preferable to knowledge fusion-based solutions.
Abstract:Familial Hypercholesterolemia (FH) is a genetic disorder characterized by elevated levels of Low-Density Lipoprotein (LDL) cholesterol or its associated genes. Early-stage and accurate categorization of FH is of significance allowing for timely interventions to mitigate the risk of life-threatening conditions. Conventional diagnosis approach, however, is complex, costly, and a challenging interpretation task even for experienced clinicians resulting in high underdiagnosis rates. Although there has been a recent surge of interest in using Machine Learning (ML) models for early FH detection, existing solutions only consider a binary classification task solely using classical ML models. Despite its significance, application of Deep Learning (DL) for FH detection is in its infancy, possibly, due to categorical nature of the underlying clinical data. The paper addresses this gap by introducing the FH-TabNet, which is a multi-stage tabular DL network for multi-class (Definite, Probable, Possible, and Unlikely) FH detection. The FH-TabNet initially involves applying a deep tabular data learning architecture (TabNet) for primary categorization into healthy (Possible/Unlikely) and patient (Probable/Definite) classes. Subsequently, independent TabNet classifiers are applied to each subgroup, enabling refined classification. The model's performance is evaluated through 5-fold cross-validation illustrating superior performance in categorizing FH patients, particularly in the challenging low-prevalence subcategories.




Abstract:Digital pathology involves converting physical tissue slides into high-resolution Whole Slide Images (WSIs), which pathologists analyze for disease-affected tissues. However, large histology slides with numerous microscopic fields pose challenges for visual search. To aid pathologists, Computer Aided Diagnosis (CAD) systems offer visual assistance in efficiently examining WSIs and identifying diagnostically relevant regions. This paper presents a novel histopathological image analysis method employing Weakly Supervised Semantic Segmentation (WSSS) based on Capsule Networks, the first such application. The proposed model is evaluated using the Atlas of Digital Pathology (ADP) dataset and its performance is compared with other histopathological semantic segmentation methodologies. The findings underscore the potential of Capsule Networks in enhancing the precision and efficiency of histopathological image analysis. Experimental results show that the proposed model outperforms traditional methods in terms of accuracy and the mean Intersection-over-Union (mIoU) metric.




Abstract:This paper addresses the challenge of mitigating data heterogeneity among clients within a Federated Learning (FL) framework. The model-drift issue, arising from the noniid nature of client data, often results in suboptimal personalization of a global model compared to locally trained models for each client. To tackle this challenge, we propose a novel approach named FedD2S for Personalized Federated Learning (pFL), leveraging knowledge distillation. FedD2S incorporates a deep-to-shallow layer-dropping mechanism in the data-free knowledge distillation process to enhance local model personalization. Through extensive simulations on diverse image datasets-FEMNIST, CIFAR10, CINIC0, and CIFAR100-we compare FedD2S with state-of-the-art FL baselines. The proposed approach demonstrates superior performance, characterized by accelerated convergence and improved fairness among clients. The introduced layer-dropping technique effectively captures personalized knowledge, resulting in enhanced performance compared to alternative FL models. Moreover, we investigate the impact of key hyperparameters, such as the participation ratio and layer-dropping rate, providing valuable insights into the optimal configuration for FedD2S. The findings demonstrate the efficacy of adaptive layer-dropping in the knowledge distillation process to achieve enhanced personalization and performance across diverse datasets and tasks.
Abstract:Drawing inspiration from the primate brain's intriguing evidence accumulation process, and guided by models from cognitive psychology and neuroscience, the paper introduces the NYCTALE framework, a neuro-inspired and evidence accumulation-based Transformer architecture. The proposed neuro-inspired NYCTALE offers a novel pathway in the domain of Personalized Medicine (PM) for lung cancer diagnosis. In nature, Nyctales are small owls known for their nocturnal behavior, hunting primarily during the darkness of night. The NYCTALE operates in a similarly vigilant manner, i.e., processing data in an evidence-based fashion and making predictions dynamically/adaptively. Distinct from conventional Computed Tomography (CT)-based Deep Learning (DL) models, the NYCTALE performs predictions only when sufficient amount of evidence is accumulated. In other words, instead of processing all or a pre-defined subset of CT slices, for each person, slices are provided one at a time. The NYCTALE framework then computes an evidence vector associated with contribution of each new CT image. A decision is made once the total accumulated evidence surpasses a specific threshold. Preliminary experimental analyses conducted using a challenging in-house dataset comprising 114 subjects. The results are noteworthy, suggesting that NYCTALE outperforms the benchmark accuracy even with approximately 60% less training data on this demanding and small dataset.




Abstract:Mobile Edge Caching (MEC) integrated with Deep Neural Networks (DNNs) is an innovative technology with significant potential for the future generation of wireless networks, resulting in a considerable reduction in users' latency. The MEC network's effectiveness, however, heavily relies on its capacity to predict and dynamically update the storage of caching nodes with the most popular contents. To be effective, a DNN-based popularity prediction model needs to have the ability to understand the historical request patterns of content, including their temporal and spatial correlations. Existing state-of-the-art time-series DNN models capture the latter by simultaneously inputting the sequential request patterns of multiple contents to the network, considerably increasing the size of the input sample. This motivates us to address this challenge by proposing a DNN-based popularity prediction framework based on the idea of contrasting input samples against each other, designed for the Unmanned Aerial Vehicle (UAV)-aided MEC networks. Referred to as the Contrastive Learning-based Survival Analysis (CLSA), the proposed architecture consists of a self-supervised Contrastive Learning (CL) model, where the temporal information of sequential requests is learned using a Long Short Term Memory (LSTM) network as the encoder of the CL architecture. Followed by a Survival Analysis (SA) network, the output of the proposed CLSA architecture is probabilities for each content's future popularity, which are then sorted in descending order to identify the Top-K popular contents. Based on the simulation results, the proposed CLSA architecture outperforms its counterparts across the classification accuracy and cache-hit ratio.