Yangzhen Wu

Learn Hard Problems During RL with Reference Guided Fine-tuning

Mar 05, 2026
Via arXiv

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Jan 13, 2025
Via arXiv

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Aug 01, 2024
Via arXiv