Haoyang Dai

OPERA: Aligning Open-Ended Reasoning via Objective Perplexity-based Reinforcement Learning

Jun 24, 2026
Via arXiv