Yongbo Gai

Grounding the Score: Explicit Visual Premise Verification for Reliable Vision-Language Process Reward Models

Mar 17, 2026
Via arXiv

Rationale Matters: Learning Transferable Rubrics via Proxy-Guided Critique for VLM Reward Models

Mar 17, 2026
Via arXiv

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Mar 10, 2026
Via arXiv

Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric

Feb 15, 2026
Via arXiv