Guowen Xu

The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

Apr 09, 2026
Via arXiv

Decoder Gradient Shields: A Family of Provable and High-Fidelity Methods Against Gradient-Based Box-Free Watermark Removal

Jan 17, 2026
Via arXiv

Hidden Tail: Adversarial Image Causing Stealthy Resource Consumption in Vision-Language Models

Aug 26, 2025
Via arXiv

FIGhost: Fluorescent Ink-based Stealthy and Flexible Backdoor Attacks on Physical Traffic Sign Recognition

May 17, 2025
Via arXiv

MPMA: Preference Manipulation Attack Against Model Context Protocol

May 16, 2025
Via arXiv

The Ripple Effect: On Unforeseen Complications of Backdoor Attacks

May 16, 2025
Via arXiv

BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models

May 06, 2025
Via arXiv

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Apr 22, 2025
Via arXiv

Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal

Feb 28, 2025
Via arXiv

CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception

Feb 07, 2025
Via arXiv