Yurun Yuan

Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

May 21, 2025
Via arXiv