Model-Based Reinforcement Learning


Reinforcement World Model Learning for LLM-based Agents

Feb 05, 2026
Via arXiv

Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning

Feb 05, 2026
Via arXiv

RL-VLA$^3$: Reinforcement Learning VLA Accelerating via Full Asynchronism

Feb 05, 2026
Via arXiv

Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models

Feb 05, 2026
Via arXiv

Laplacian Representations for Decision-Time Planning

Feb 04, 2026
Via arXiv

Quantum Reinforcement Learning with Transformers for the Capacitated Vehicle Routing Problem

Feb 05, 2026
Via arXiv

TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning

Feb 05, 2026
Via arXiv

Can vision language models learn intuitive physics from interaction?

Feb 05, 2026
Via arXiv

OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention

Feb 05, 2026
Via arXiv

Rewards as Labels: Revisiting RLVR from a Classification Perspective

Feb 05, 2026
Via arXiv