Xuehai Tang

Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills

Apr 28, 2026

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

Apr 24, 2026

FABLE: Fine-grained Fact Anchoring for Unstructured Model Editing

Apr 14, 2026

HearSay Benchmark: Do Audio LLMs Leak What They Hear?

Jan 07, 2026

Exploiting Synergistic Cognitive Biases to Bypass Safety in LLMs

Jul 30, 2025

Paper Summary Attack: Jailbreaking LLMs through LLM Safety Papers

Jul 17, 2025

LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing

May 21, 2025

The Dark Side of Trust: Authority Citation-Driven Jailbreak Attacks on Large Language Models

Nov 18, 2024

AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMs

Sep 11, 2024

Enhancing Cross-Prompt Transferability in Vision-Language Models through Contextual Injection of Target Tokens

Jun 19, 2024