
Yige Li

X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP

May 08, 2025

Propaganda via AI? A Study on Semantic Backdoors in Large Language Models

Apr 15, 2025

A Practical Memory Injection Attack against LLM Agents

Mar 05, 2025

Detecting Backdoor Samples in Contrastive Language Image Pretraining

Feb 03, 2025

Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models

Jan 05, 2025

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization

Nov 18, 2024

BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks

Oct 28, 2024

Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models

Oct 25, 2024

AnyAttack: Towards Large-scale Self-supervised Generation of Targeted Adversarial Examples for Vision-Language Models

Oct 07, 2024

Adversarial Suffixes May Be Features Too!

Oct 01, 2024