Shahar Mendel

Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data

Jan 21, 2026
Via arXiv