Devin Chen

Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings

Mar 11, 2026
Via arXiv