Xiaoyang Wang

Balancing Fairness and Performance in Healthcare AI: A Gradient Reconciliation Approach

Apr 19, 2025

Xiaoyang Wang, Christopher C. Yang

Abstract: The rapid growth of healthcare data and advances in computational power have accelerated the adoption of artificial intelligence (AI) in medicine. However, AI systems deployed without explicit fairness considerations risk exacerbating existing healthcare disparities, potentially leading to inequitable resource allocation and diagnostic disparities across demographic subgroups. To address this challenge, we propose FairGrad, a novel gradient reconciliation framework that automatically balances predictive performance and multi-attribute fairness optimization in healthcare AI models. Our method resolves conflicting optimization objectives by projecting each gradient vector onto the orthogonal plane of the others, thereby regularizing the optimization trajectory to ensure equitable consideration of all objectives. Evaluated on diverse real-world healthcare datasets and predictive tasks, including Substance Use Disorder (SUD) treatment and sepsis mortality, FairGrad achieved statistically significant improvements in multi-attribute fairness metrics (e.g., equalized odds) while maintaining competitive predictive accuracy. These results demonstrate the viability of harmonizing fairness and utility in mission-critical medical AI applications.

* Accepted to the 23rd International Conference on Artificial Intelligence in Medicine (AIME 2025)
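
The gradient reconciliation step described in the FairGrad abstract can be illustrated with a small sketch. The abstract does not state the exact update rule, so this assumes a PCGrad-style scheme: each objective's gradient is projected onto the plane orthogonal to any other gradient it conflicts with (negative dot product), and the projected gradients are summed. The function name `reconcile_gradients` and the conflict test are assumptions for illustration, not the authors' implementation.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def reconcile_gradients(grads):
    """Combine per-objective gradients (hypothetical helper).

    Assumption: a gradient is projected onto the plane orthogonal to
    another gradient only when the two conflict (negative dot product).
    """
    projected = []
    for i, g_i in enumerate(grads):
        g = list(g_i)
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            d = dot(g, g_j)
            n = dot(g_j, g_j)
            if d < 0 and n > 0:
                # Remove the component of g that points against g_j.
                g = [a - (d / n) * b for a, b in zip(g, g_j)]
        projected.append(g)
    # Sum the projected gradients coordinate by coordinate.
    return [sum(coords) for coords in zip(*projected)]


# One performance gradient and one fairness gradient that conflict.
g_perf = [1.0, 0.0]
g_fair = [-1.0, 1.0]
update = reconcile_gradients([g_perf, g_fair])
```

With these inputs the combined update is `[0.5, 1.5]`, which has a non-negative dot product with both original gradients, so neither objective is pushed backwards by the step.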

DeepSelective: Feature Gating and Representation Matching for Interpretable Clinical Prediction

Apr 15, 2025

Ruochi Zhang, Qian Yang, Xiaoyang Wang, Haoran Wu, Qiong Zhou, Yu Wang, Kewei Li, Yueying Wang, Yusi Fan, Jiale Zhang(+3 more)

Abstract: The rapid accumulation of Electronic Health Records (EHRs) has transformed healthcare by providing valuable data that enhance clinical predictions and diagnoses. While conventional machine learning models have proven effective, they often lack robust representation learning and depend heavily on expert-crafted features. Although deep learning offers powerful solutions, it is often criticized for its lack of interpretability. To address these challenges, we propose DeepSelective, a novel end-to-end deep learning framework for predicting patient prognosis using EHR data, with a strong emphasis on enhancing model interpretability. DeepSelective combines data compression techniques with an innovative feature selection approach, integrating custom-designed modules that work together to improve both accuracy and interpretability. Our experiments demonstrate that DeepSelective not only enhances predictive accuracy but also significantly improves interpretability, making it a valuable tool for clinical decision-making. The source code is freely available at http://www.healthinformaticslab.org/supp/resources.php.

Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection

Mar 04, 2025

Ragini Gupta, Shinan Liu, Ruixiao Zhang, Xinyue Hu, Pranav Kommaraju, Xiaoyang Wang, Hadjer Benkraouda, Nick Feamster, Klara Nahrstedt

Abstract: Machine learning has shown promise in network intrusion detection systems, yet its performance often degrades due to concept drift and imbalanced data. These challenges are compounded by the labor-intensive process of labeling network traffic, especially when dealing with evolving and rare attack types, which makes selecting the right data for adaptation difficult. To address these issues, we propose a generative active adaptation framework that minimizes labeling effort while enhancing model robustness. Our approach employs density-aware active sampling to identify the most informative samples for annotation and leverages deep generative models to synthesize diverse samples, thereby augmenting the training set and mitigating the effects of concept drift. We evaluate our end-to-end framework on both simulated IDS data and a real-world ISP dataset, demonstrating significant improvements in intrusion detection performance. Our method boosts the overall F1-score from 0.60 (without adaptation) to 0.86. On the CIC-IDS 2018 dataset, rare attacks such as Infiltration, Web Attack, and FTP-BruteForce, which originally achieve F1 scores of 0.001, 0.04, and 0.00, improve to 0.30, 0.50, and 0.71, respectively, with generative active adaptation. Our framework effectively enhances rare attack detection while reducing labeling costs, making it a scalable and adaptive solution for real-world intrusion detection.
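
The abstract above does not define how density-aware active sampling scores candidates, so the following is only a rough sketch of one common heuristic, information density, in which model uncertainty is weighted by how densely populated a sample's neighborhood is. The Gaussian kernel, the `bandwidth` parameter, and the function names are all assumptions for illustration and may differ from what the paper does.

```python
import math


def kernel_density(points, idx, bandwidth=1.0):
    """Average Gaussian-kernel similarity of points[idx] to all other points."""
    x = points[idx]
    total = 0.0
    for j, y in enumerate(points):
        if j == idx:
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / max(len(points) - 1, 1)


def select_for_labeling(points, uncertainty, budget, bandwidth=1.0):
    """Return the indices of the `budget` highest-scoring unlabeled samples.

    Assumed score: uncertainty * local density (information-density
    heuristic), so isolated outliers are not picked on uncertainty alone.
    """
    scores = [
        uncertainty[i] * kernel_density(points, i, bandwidth)
        for i in range(len(points))
    ]
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Three clustered flows and one isolated, highly uncertain flow.
flows = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
uncertainty = [0.5, 0.5, 0.5, 0.9]
chosen = select_for_labeling(flows, uncertainty, budget=2)
```

In this toy input the isolated point has the highest uncertainty but almost no neighbors, so it is not selected. A scheme aimed at rare attacks might instead want to favor such low-density regions, which is why the scoring rule here should be read as a placeholder.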

OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas

Jan 26, 2025

Xiaoyang Wang, Hongming Zhang, Tao Ge, Wenhao Yu, Dian Yu, Dong Yu

Abstract: Customizable role-playing in large language models (LLMs), also known as character generalization, is gaining increasing attention for its versatility and cost-efficiency in developing and deploying role-playing dialogue agents. This study explores a large-scale data synthesis approach to equip LLMs with character generalization capabilities. We begin by synthesizing large-scale character profiles using personas from Persona Hub and then explore two strategies, response rewriting and response generation, to create character-aligned instructional responses. To validate the effectiveness of our synthetic instruction tuning data for character generalization, we perform supervised fine-tuning (SFT) using the LLaMA-3 8B model. Our best-performing model strengthens the original LLaMA-3 8B Instruct model and achieves performance comparable to GPT-4o models on role-playing dialogue. We release our synthetic characters and instruction-tuning dialogues to support public research.

CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection

Dec 31, 2024

Xiaolei Wang, Xiaoyang Wang, Huihui Bai, Eng Gee Lim, Jimin Xiao

Abstract: Existing unsupervised distillation-based methods rely on the differences between encoded and decoded features to locate abnormal regions in test images. However, the decoder trained only on normal samples still reconstructs abnormal patch features well, degrading performance. This issue is particularly pronounced in unsupervised multi-class anomaly detection tasks. We attribute this behavior to over-generalization (OG) of the decoder: the significantly increasing diversity of patch patterns in multi-class training enhances the model's generalization on normal patches, but also inadvertently broadens its generalization to abnormal patches. To mitigate OG, we propose a novel approach that leverages class-agnostic learnable prompts to capture common textual normality across various visual patterns, and then applies them to guide the decoded features towards a normal textual representation, suppressing over-generalization of the decoder on abnormal patterns. To further improve performance, we also introduce a gated mixture-of-experts module to specialize in handling diverse patch patterns and reduce mutual interference between them in multi-class training. Our method achieves competitive performance on the MVTec AD and VisA datasets, demonstrating its effectiveness.

* Accepted by AAAI 2025

Adapting Unsigned Graph Neural Networks for Signed Graphs: A Few-Shot Prompt Tuning Approach

Dec 11, 2024

Zian Zhai, Sima Qing, Xiaoyang Wang, Wenjie Zhang

Abstract: Signed Graph Neural Networks (SGNNs) are powerful tools for signed graph representation learning but struggle with limited generalization and heavy dependence on labeled data. While recent advancements in "graph pre-training and prompt tuning" have reduced label dependence in Graph Neural Networks (GNNs) and improved their generalization abilities by leveraging pre-training knowledge, these efforts have focused exclusively on unsigned graphs. The scarcity of publicly available signed graph datasets makes it essential to transfer knowledge from unsigned graphs to signed graph tasks. However, this transfer introduces significant challenges due to the graph-level and task-level divergences between the pre-training and downstream phases. To address these challenges, we propose Signed Graph Prompt Tuning (SGPT) in this paper. Specifically, SGPT employs a graph template and a semantic prompt to segregate mixed link semantics in the signed graph and then adaptively integrate the distinctive semantic information according to the needs of downstream tasks, thereby unifying the pre-training and downstream graphs. Additionally, SGPT utilizes a task template and a feature prompt to reformulate the downstream signed graph tasks, aligning them with pre-training tasks to ensure a unified optimization objective and consistent feature space across tasks. Finally, extensive experiments are conducted on popular signed graph datasets, demonstrating the superiority of SGPT over state-of-the-art methods.

Efficient Dynamic Attributed Graph Generation

Dec 11, 2024

Fan Li, Xiaoyang Wang, Dawei Cheng, Cong Chen, Ying Zhang, Xuemin Lin

Abstract: Data generation is a fundamental research problem in data management due to its diverse use cases, ranging from testing database engines to data-specific applications. However, real-world entities often involve complex interactions that cannot be effectively modeled by traditional tabular data. Therefore, graph data generation has attracted increasing attention recently. Although various graph generators have been proposed in the literature, they have three limitations: i) They cannot capture the co-evolution pattern of graph structure and node attributes. ii) Few of them consider edge direction, leading to substantial information loss. iii) Current state-of-the-art dynamic graph generators are based on the temporal random walk, making the simulation process time-consuming. To fill the research gap, we introduce VRDAG, a novel variational recurrent framework for efficient dynamic attributed graph generation. Specifically, we design a bidirectional message-passing mechanism to encode both directed structural knowledge and attribute information of a snapshot. Then, the temporal dependency in the graph sequence is captured by a recurrence state updater, generating embeddings that can preserve the evolution pattern of early graphs. Based on the hidden node embeddings, a conditional variational Bayesian method is developed to sample latent random variables at the neighboring timestep for new snapshot generation. The proposed generation paradigm avoids the time-consuming path sampling and merging process in existing random walk-based methods, significantly reducing the synthesis time. Finally, comprehensive experiments on real-world datasets are conducted to demonstrate the effectiveness and efficiency of the proposed model.

* 14 pages, 10 figures. Accepted by IEEE ICDE 2025

Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning

Oct 21, 2024

Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Xin Yuan, Wenjie Zhang

Abstract: Large Language Models (LLMs) have achieved impressive results in various tasks but struggle with hallucination problems and lack of relevant knowledge, especially in deep complex reasoning and knowledge-intensive tasks. Knowledge Graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. However, existing KG-based LLM reasoning methods face challenges such as handling multi-hop reasoning, handling multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. PoG tackles multi-hop and multi-entity questions through a three-phase dynamic multi-hop path exploration, which combines the inherent knowledge of LLMs with factual knowledge from KGs. To improve efficiency, PoG first prunes irrelevant information from the graph exploration and introduces efficient three-step pruning techniques that incorporate graph structures, LLM prompting, and a pre-trained language model (e.g., SBERT) to effectively narrow down the explored candidate paths. This ensures all reasoning paths contain highly relevant information captured from KGs, making the reasoning faithful and interpretable in problem-solving. PoG innovatively utilizes graph structure to prune the irrelevant noise and represents the first method to implement multi-entity deep path detection on KGs for LLM reasoning tasks. Comprehensive experiments on five benchmark KGQA datasets demonstrate that PoG outperforms the state-of-the-art method ToG across GPT-3.5-Turbo and GPT-4, achieving an average accuracy improvement of 18.9%. Notably, PoG with GPT-3.5-Turbo surpasses ToG with GPT-4 by up to 23.9%.

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers

Oct 17, 2024

Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Ang Li, Dong Yu

Abstract: Traditional transformer models often allocate a fixed amount of computational resources to every input token, leading to inefficient and unnecessary computation. To address this, the Mixture of Depths (MoD) was introduced to dynamically adjust the computational depth by skipping less important layers. Despite its promise, current MoD approaches remain under-explored and face two main challenges: (1) high training costs due to the need to train the entire model along with the routers that determine which layers to skip, and (2) the risk of performance degradation when important layers are bypassed. In response to the first issue, we propose Router-Tuning, a method that fine-tunes only the router on a small dataset, drastically reducing the computational overhead associated with full model training. For the second challenge, we propose MindSkip, which deploys Attention with Dynamic Depths. This method preserves the model's performance while significantly enhancing computational and memory efficiency. Extensive experiments demonstrate that our approach delivers competitive results while dramatically improving computational efficiency, e.g., a 21% speedup with only a 0.2% performance drop. The code is released at https://github.com/CASE-Lab-UMD/Router-Tuning.
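
The idea of a router that decides whether a sublayer runs can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Router-Tuning or MindSkip implementation: the linear-plus-sigmoid router, the 0.5 threshold, scaling the sublayer output by the router score, and the name `dynamic_depth_block` are all hypothetical. In the setting the abstract describes, only the router parameters (`router_w`, `router_b` here) would be fine-tuned while the rest of the model stays frozen.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dynamic_depth_block(x, router_w, router_b, sublayer, threshold=0.5):
    """Apply `sublayer` to x only when the router score reaches the threshold.

    Assumptions: a single linear router followed by a sigmoid, and a
    residual connection whose sublayer branch is scaled by the score.
    Returns the block output and the router score.
    """
    score = sigmoid(sum(w * a for w, a in zip(router_w, x)) + router_b)
    if score < threshold:
        # Skip the sublayer entirely: only the residual path is kept.
        return list(x), score
    out = sublayer(x)
    return [a + score * b for a, b in zip(x, out)], score


def toy_sublayer(v):
    """Stand-in for an attention sublayer (doubles each coordinate)."""
    return [2.0 * a for a in v]


token = [1.0, 2.0]
skipped, low_score = dynamic_depth_block(token, [0.0, 0.0], -5.0, toy_sublayer)
computed, high_score = dynamic_depth_block(token, [0.0, 0.0], 5.0, toy_sublayer)
```

When the router score falls below the threshold the token passes through unchanged and the sublayer's computation is never performed, which is where the savings in a dynamic-depth model would come from.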

TCGU: Data-centric Graph Unlearning based on Transferable Condensation

Oct 09, 2024

Fan Li, Xiaoyang Wang, Dawei Cheng, Wenjie Zhang, Ying Zhang, Xuemin Lin

Abstract: With growing demands for data privacy and model robustness, graph unlearning (GU), which erases the influence of specific data on trained GNN models, has gained significant attention. However, existing exact unlearning methods suffer from either low efficiency or poor model performance. While being more utility-preserving and efficient, current approximate unlearning methods are not applicable in the zero-glance privacy setting, where the deleted samples cannot be accessed during unlearning because regulations require their immediate deletion. Besides, these approximate methods, which try to directly perturb model parameters, still involve high privacy concerns in practice. To fill the gap, we propose Transferable Condensation Graph Unlearning (TCGU), a data-centric solution to zero-glance graph unlearning. Specifically, we first design a two-level alignment strategy to pre-condense the original graph into a small yet utility-preserving dataset. Upon receiving an unlearning request, we fine-tune the pre-condensed data with a low-rank plugin to directly align its distribution with the remaining graph, thus efficiently revoking the information of deleted data without accessing them. A novel similarity distribution matching approach and a discrimination regularizer are proposed to effectively transfer condensed data and preserve its utility in GNN training, respectively. Finally, we retrain the GNN on the transferred condensed data. Extensive experiments on 6 benchmark datasets demonstrate that TCGU outperforms existing GU methods in terms of model utility, unlearning efficiency, and unlearning efficacy.

* 14 pages, 18 figures
