Ganqu Cui

Teaching Large Reasoning Models Effective Reflection

Jan 19, 2026

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025

P1: Mastering Physics Olympiads with Reinforcement Learning

Nov 17, 2025

FlowRL: Matching Reward Distributions for LLM Reasoning

Sep 18, 2025

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Sep 11, 2025

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Sep 10, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025

Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback

Aug 17, 2025

InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling

Aug 12, 2025

MiniCPM4: Ultra-Efficient LLMs on End Devices

Jun 09, 2025