Shashwat Saxena

OGPO: Sample Efficient Full-Finetuning of Generative Control Policies

May 04, 2026
Via arXiv

Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories

Apr 19, 2026
Via arXiv

Hodoscope: Unsupervised Monitoring for AI Misbehaviors

Apr 13, 2026
Via arXiv