
Xinge Ye

CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization

Aug 12, 2025