Zhuoran Yang

From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems

May 30, 2024
Via arXiv

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

May 28, 2024
Via arXiv

Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning

Apr 30, 2024
Via arXiv

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

Apr 19, 2024
Via arXiv

A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations

Apr 18, 2024
Via arXiv

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory

Mar 18, 2024
Via arXiv

On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games

Mar 01, 2024
Via arXiv

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality

Feb 29, 2024
Via arXiv

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning

Feb 16, 2024
Via arXiv

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF

Feb 10, 2024
Via arXiv