Q Learning


UAV Trajectory Optimization via Improved Noisy Deep Q-Network

Feb 05, 2026
Via arXiv

EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL

Feb 04, 2026
Via arXiv

Periodic Regularized Q-Learning

Feb 03, 2026
Via arXiv

The Role of Target Update Frequencies in Q-Learning

Feb 03, 2026
Via arXiv

Structuring Value Representations via Geometric Coherence in Markov Decision Processes

Feb 03, 2026
Via arXiv

Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity

Feb 03, 2026
Via arXiv

Choice-Model-Assisted Q-learning for Delayed-Feedback Revenue Management

Feb 02, 2026
Via arXiv

CRoSS: A Continual Robotic Simulation Suite for Scalable Reinforcement Learning with High Task Diversity and Realistic Physics Simulation

Feb 04, 2026
Via arXiv

Q-ShiftDP: A Differentially Private Parameter-Shift Rule for Quantum Machine Learning

Feb 03, 2026
Via arXiv

Causal Flow Q-Learning for Robust Offline Reinforcement Learning

Feb 02, 2026
Via arXiv