Kongcheng Zhang

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Jun 10, 2025
Via arXiv

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

May 25, 2025
Via arXiv

Reasoning with Reinforced Functional Token Tuning

Feb 19, 2025
Via arXiv

Odyssey: Empowering Agents with Open-World Skills

Jul 22, 2024
Via arXiv