Yunhao Feng

Constitutional On-Policy Safe Distillation

Jun 02, 2026
Via arXiv

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

May 31, 2026
Via arXiv

Position: AI Safety Requires Effective Controllability

May 26, 2026
Via arXiv

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Apr 08, 2026
Via arXiv

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

Apr 03, 2026
Via arXiv

BackdoorAgent: A Unified Framework for Backdoor Attacks on LLM-based Agents

Jan 08, 2026
Via arXiv