Target Policy Smoothing


Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration

Feb 03, 2026

No More, No Less: Least-Privilege Language Models

Jan 30, 2026

IRL-DAL: Safe and Adaptive Trajectory Planning for Autonomous Driving via Energy-Guided Diffusion Models

Jan 30, 2026

Latent Spherical Flow Policy for Reinforcement Learning with Combinatorial Actions

Jan 29, 2026

Diffusion-Guided Backdoor Attacks in Real-World Reinforcement Learning

Jan 20, 2026

Efficient Inference for Inverse Reinforcement Learning and Dynamic Discrete Choice Models

Dec 30, 2025

Learning Controllable and Diverse Player Behaviors in Multi-Agent Environments

Dec 11, 2025

Balancing Centralized Learning and Distributed Self-Organization: A Hybrid Model for Embodied Morphogenesis

Nov 13, 2025

Multi-Modal Decentralized Reinforcement Learning for Modular Reconfigurable Lunar Robots

Oct 23, 2025

Real-Time Gait Adaptation for Quadrupeds using Model Predictive Control and Reinforcement Learning

Oct 23, 2025