Reinforcement Learning


Beyond Where to Look: Trajectory-Guided Reinforcement Learning for Multimodal RLVR

Mar 27, 2026

Dynamic Token Compression for Efficient Video Understanding through Reinforcement Learning

Mar 27, 2026

Meta-Adaptive Beam Search Planning for Transformer-Based Reinforcement Learning Control of UAVs with Overhead Manipulators under Flight Disturbances

Mar 27, 2026

Designing Fatigue-Aware VR Interfaces via Biomechanical Models

Mar 27, 2026

120 Minutes and a Laptop: Minimalist Image-goal Navigation via Unsupervised Exploration and Offline RL

Mar 27, 2026

Automatic feature identification in least-squares policy iteration using the Koopman operator framework

Mar 27, 2026

Rethinking Recommendation Paradigms: From Pipelines to Agentic Recommender Systems

Mar 27, 2026

AutoB2G: A Large Language Model-Driven Agentic Framework For Automated Building-Grid Co-Simulation

Mar 27, 2026

Dynamic Tokenization via Reinforcement Patching: End-to-end Training and Zero-shot Transfer

Mar 27, 2026

VLA-OPD: Bridging Offline SFT and Online RL for Vision-Language-Action Models via On-Policy Distillation

Mar 27, 2026