Ranjie Duan

Towards Class-wise Fair Adversarial Training via Anti-Bias Soft Label Distillation

Jun 10, 2025

The Eye of Sherlock Holmes: Uncovering User Private Attribute Profiling via Vision-Language Model Agentic Framework

May 25, 2025

Enhancing Adversarial Robustness of Vision Language Models via Adversarial Mixture Prompt Tuning

May 23, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models

Apr 25, 2025

STAIR: Improving Safety Alignment with Introspective Reasoning

Feb 04, 2025

Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink

Jan 25, 2025

Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency

Jan 09, 2025

MRJ-Agent: An Effective Jailbreak Agent for Multi-Round Dialogue

Nov 06, 2024

RT-Attack: Jailbreaking Text-to-Image Models via Random Token

Aug 27, 2024

Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging

Aug 22, 2023