Xuanli He

GRADA: Graph-based Reranker against Adversarial Documents Attack

May 12, 2025
Via arXiv

Defending Deep Neural Networks against Backdoor Attacks via Module Switching

Apr 08, 2025
Via arXiv

Self-Training Large Language Models for Tool-Use Without Demonstrations

Feb 09, 2025
Via arXiv

Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution

Dec 29, 2024
Via arXiv

An Auditing Test To Detect Behavioral Shift in Language Models

Oct 25, 2024
Via arXiv

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Oct 21, 2024
Via arXiv

Are We Done with MMLU?

Jun 07, 2024
Via arXiv

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Jun 05, 2024
Via arXiv

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks

May 19, 2024
Via arXiv

Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning

Apr 30, 2024
Via arXiv