Chen Bo Calvin Zhang

Reliable Weak-to-Strong Monitoring of LLM Agents

Aug 26, 2025
Via arXiv

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization

Oct 17, 2024
Via arXiv