Aishan Liu

MASteer: Multi-Agent Adaptive Steer Strategy for End-to-End LLM Trustworthiness Repair

Aug 09, 2025
Via arXiv

Investigating Training Data Detection in AI Coders

Jul 23, 2025
Via arXiv

ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks

Jul 02, 2025
Via arXiv

SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents

Jul 01, 2025
Via arXiv

AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions

Jun 17, 2025
Via arXiv

Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025

Jun 14, 2025
Via arXiv

ME: Trigger Element Combination Backdoor Attack on Copyright Infringement

Jun 12, 2025
Via arXiv

SRD: Reinforcement-Learned Semantic Perturbation for Backdoor Defense in VLMs

Jun 05, 2025
Via arXiv

Manipulating Multimodal Agents via Cross-Modal Prompt Injection

Apr 22, 2025
Via arXiv

T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models

Apr 22, 2025
Via arXiv