Qiuyang Zhao

HARVE: Hacking-Aware Reward-Head Vector Editing for Robust Reward Models

Jun 02, 2026
Via arXiv

LayerTracer: A Joint Task-Particle and Vulnerable-Layer Analysis framework for Arbitrary Large Language Model Architectures

Apr 22, 2026
Via arXiv