Jingyong Ye

Unrewarded Exploration in Large Language Models Reveals Latent Learning from Psychology

Jan 30, 2026
Via arXiv

AAPO: Enhance the Reasoning Capabilities of LLMs with Advantage Momentum

May 20, 2025
Via arXiv