Guannan Qu

Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients

May 11, 2026

Towards Effective Theory of LLMs: A Representation Learning Approach

May 10, 2026

BLOCK-EM: Preventing Emergent Misalignment by Blocking Causal Features

Jan 31, 2026

Polynomial Convergence of Riemannian Diffusion Models

Jan 05, 2026

Transformer-Based Scalable Multi-Agent Reinforcement Learning for Networked Systems with Long-Range Interactions

Nov 17, 2025

Comparative Field Deployment of Reinforcement Learning and Model Predictive Control for Residential HVAC

Oct 01, 2025

A Theoretical Study of (Hyper) Self-Attention through the Lens of Interactions: Representation, Training, Generalization

Jun 06, 2025

Thinking Beyond Visibility: A Near-Optimal Policy Framework for Locally Interdependent Multi-Agent MDPs

Jun 04, 2025

Natural Policy Gradient for Average Reward Non-Stationary RL

Apr 23, 2025

Whole-Body Model-Predictive Control of Legged Robots with MuJoCo

Mar 06, 2025