Zhechao Yu

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Mar 10, 2026
Via arXiv

Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance

Dec 29, 2025
Via arXiv