Caijun Xu

DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

May 27, 2026

Reinforcement Learning with Conditional Expectation Reward

Mar 11, 2026

CoDiQ: Test-Time Scaling for Controllable Difficult Question Generation

Feb 02, 2026

SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning

Jan 08, 2026