Ling Pan

Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy

Jun 09, 2026
Via arXiv

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

Jun 07, 2026
Via arXiv

Edit-R2: Context-Aware Reinforcement Learning for Multi-Turn Image Editing

Jun 04, 2026
Via arXiv

Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions

Jun 04, 2026
Via arXiv

Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming

Apr 07, 2026
Via arXiv

Complementary Reinforcement Learning

Mar 18, 2026
Via arXiv

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025
Via arXiv

GARDO: Reinforcing Diffusion Models without Reward Hacking

Dec 30, 2025
Via arXiv

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Oct 02, 2025
Via arXiv

Generative Flow Networks for Personalized Multimedia Systems: A Case Study on Short Video Feeds

Aug 23, 2025
Via arXiv