Jianfeng Shan

Exploiting Verification-Generation Gap: Test-Time Reinforcement Learning with Confidence-Conditioned Verification

Jun 02, 2026
Via arXiv