Yuxin Zuo

Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning

May 07, 2026
Via arXiv

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Apr 14, 2026
Via arXiv

Towards Knowledgeable Deep Research: Framework and Benchmark

Apr 09, 2026
Via arXiv

TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning

Apr 01, 2026
Via arXiv

How Far Can Unsupervised RLVR Scale LLM Training?

Mar 09, 2026
Via arXiv

WebWorld: A Large-Scale World Model for Web Agent Training

Feb 16, 2026
Via arXiv

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Feb 10, 2026
Via arXiv

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025
Via arXiv

P1: Mastering Physics Olympiads with Reinforcement Learning

Nov 17, 2025
Via arXiv

FlowRL: Matching Reward Distributions for LLM Reasoning

Sep 18, 2025
Via arXiv