Lantao Mei

A Very Big Video Reasoning Suite

Feb 24, 2026
Via arXiv

$f$-GRPO and Beyond: Divergence-Based Reinforcement Learning Algorithms for General LLM Alignment

Feb 05, 2026
Via arXiv