Yurun Yuan

Reinforce LLM Reasoning through Multi-Agent Reflection

Jun 10, 2025
Via arXiv

Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

May 21, 2025
Via arXiv