Xiang Zheng

RedRFT: A Light-Weight Benchmark for Reinforcement Fine-Tuning-Based Red Teaming

Jun 04, 2025
Via arXiv

PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks

May 22, 2025
Via arXiv

Reinforced Diffuser for Red Teaming Large Vision-Language Models

Mar 08, 2025
Via arXiv

SCORE: Saturated Consensus Relocalization in Semantic Line Maps

Mar 05, 2025
Via arXiv

BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction

Feb 26, 2025
Via arXiv

CALM: Curiosity-Driven Auditing for Large Language Models

Jan 06, 2025
Via arXiv

Wavelet Diffusion Neural Operator

Dec 06, 2024
Via arXiv

BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks

Oct 28, 2024
Via arXiv

Closed-loop Diffusion Control of Complex Physical Systems

Jul 31, 2024
Via arXiv

Constrained Intrinsic Motivation for Reinforcement Learning

Jul 12, 2024
Via arXiv