reinforcement learning


Quantile of Means: A Bonus-Free Ensemble Method for Minimax Optimal Reinforcement Learning

Jun 18, 2026
Via arXiv

Direct Advantage Estimation for Scalable and Sample-efficient Deep Reinforcement Learning

Jun 18, 2026
Via arXiv

Multi-Granular Attention-Driven Reinforcement Learning Framework for Web Intelligent Enhancement Systems

Jun 18, 2026
Via arXiv

Process-Verified Reinforcement Learning for Theorem Proving via Lean

Jun 18, 2026
Via arXiv

Augmenting Game AI with Deep Reinforcement Learning

Jun 18, 2026
Via arXiv

CRAX: Fast Safe Reinforcement Learning Benchmarking

Jun 18, 2026
Via arXiv

A Model-Driven Approach for Developing Families of Reinforcement Learning Environments

Jun 18, 2026
Via arXiv

Temporal Self-Imitation Learning

Jun 18, 2026
Via arXiv

MetaResearcher: Scaling Deep Research via Self-Reflective Reinforcement Learning in Adversarial Virtual Environments

Jun 18, 2026
Via arXiv

Reinforcement Learning Foundation Models Should Already Be A Thing

Jun 18, 2026
Via arXiv