Yuxin Zuo

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025
Via arXiv

P1: Mastering Physics Olympiads with Reinforcement Learning

Nov 17, 2025
Via arXiv

FlowRL: Matching Reward Distributions for LLM Reasoning

Sep 18, 2025
Via arXiv

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Sep 11, 2025
Via arXiv

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025
Via arXiv

Towards a Unified View of Large Language Model Post-Training

Sep 04, 2025
Via arXiv

Automating Exploratory Multiomics Research via Language Models

Jun 09, 2025
Via arXiv

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

May 28, 2025
Via arXiv

TTRL: Test-Time Reinforcement Learning

Apr 22, 2025
Via arXiv

Towards Event Extraction with Massive Types: LLM-based Collaborative Annotation and Partitioning Extraction

Mar 04, 2025
Via arXiv