Jiawen Zhang

REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak

May 20, 2026

TGPP: Trajectory-Guided Plug-and-Play Priors for Sparse Radio Map Reconstruction

May 07, 2026

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

Mar 25, 2026

Understanding and Preserving Safety in Fine-Tuned LLMs

Jan 15, 2026

Safety at One Shot: Patching Fine-Tuned LLMs with A Single Instance

Jan 06, 2026

Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains

May 26, 2025

SHAPE : Self-Improved Visual Preference Alignment by Iteratively Generating Holistic Winner

Mar 06, 2025

Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting

Mar 03, 2025

SecPE: Secure Prompt Ensembling for Private and Robust Large Language Models

Feb 02, 2025

Activation Approximations Can Incur Safety Vulnerabilities Even in Aligned LLMs: Comprehensive Analysis and Defense

Feb 02, 2025