Reinforcement Learning


AffordanceGrasp-R1: Leveraging Reasoning-Based Affordance Segmentation with Reinforcement Learning for Robotic Grasping

Feb 03, 2026
Via arXiv

CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

Feb 03, 2026
Via arXiv

StepScorer: Accelerating Reinforcement Learning with Step-wise Scoring and Psychological Regret Modeling

Feb 03, 2026
Via arXiv

medR: Reward Engineering for Clinical Offline Reinforcement Learning via Tri-Drive Potential Functions

Feb 03, 2026
Via arXiv

From Scalar Rewards to Potential Trends: Shaping Potential Landscapes for Model-Based Reinforcement Learning

Feb 03, 2026
Via arXiv

PEGRL: Improving Machine Translation by Post-Editing Guided Reinforcement Learning

Feb 03, 2026
Via arXiv

Reinforcement Learning with Promising Tokens for Large Language Models

Feb 03, 2026
Via arXiv

ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning

Feb 03, 2026
Via arXiv

Adaptive Rollout Allocation for Online Reinforcement Learning with Verifiable Rewards

Feb 03, 2026
Via arXiv

CPMobius: Iterative Coach-Player Reasoning for Data-Free Reinforcement Learning

Feb 03, 2026
Via arXiv