Zhuoran Yang

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

May 28, 2024
Via arXiv

Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning

Apr 30, 2024
Via arXiv

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

Apr 19, 2024
Via arXiv

A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations

Apr 18, 2024
Via arXiv

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory

Mar 18, 2024
Via arXiv

On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games

Mar 01, 2024
Via arXiv

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality

Feb 29, 2024
Via arXiv

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning

Feb 16, 2024
Via arXiv

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF

Feb 10, 2024
Via arXiv

Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems

Dec 02, 2023
Via arXiv