Aleksei Arzhantsev

Self-Consistency via Marginal Sharpening

May 27, 2026
Via arXiv

Off-Policy Learning to Reason Works Because It Is More Pessimistic Than You Think

May 27, 2026
Via arXiv