Zhuoran Yang

Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time

Aug 16, 2020
Via arXiv

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy

Aug 02, 2020
Via arXiv

Understanding Implicit Regularization in Over-Parameterized Nonlinear Statistical Model

Jul 16, 2020
Via arXiv

A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic

Jul 10, 2020
Via arXiv

Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach

Jul 02, 2020
Via arXiv

Dynamic Regret of Policy Optimization in Non-stationary Environments

Jun 30, 2020
Via arXiv

On the Global Optimality of Model-Agnostic Meta-Learning

Jun 23, 2020
Via arXiv

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret

Jun 22, 2020
Via arXiv

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Jun 22, 2020
Via arXiv

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning

Jun 21, 2020
Via arXiv