Reinforcement Learning


Multi-Agent Craftax: Benchmarking Open-Ended Multi-Agent Reinforcement Learning at the Hyperscale

Nov 07, 2025

Minority-Aware Satisfaction Estimation in Dialogue Systems via Preference-Adaptive Reinforcement Learning

Nov 07, 2025

Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning

Nov 07, 2025

PreResQ-R1: Towards Fine-Grained Rank-and-Score Reinforcement Learning for Visual Quality Assessment via Preference-Response Disentangled Policy Optimization

Nov 07, 2025

TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning

Nov 07, 2025

Visual Spatial Tuning

Nov 07, 2025

You Need Reasoning to Learn Reasoning: The Limitations of Label-Free RL in Weak Base Models

Nov 07, 2025

TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework

Nov 07, 2025

Quantum Boltzmann Machines for Sample-Efficient Reinforcement Learning

Nov 06, 2025

Fitting Reinforcement Learning Model to Behavioral Data under Bandits

Nov 06, 2025