Zhaoran Wang

Dynamic Regret of Policy Optimization in Non-stationary Environments

Jun 30, 2020
Via arXiv

On the Global Optimality of Model-Agnostic Meta-Learning

Jun 23, 2020
Via arXiv

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret

Jun 22, 2020
Via arXiv

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Jun 22, 2020
Via arXiv

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning

Jun 21, 2020
Via arXiv

Neural Certificates for Safe Control Policies

Jun 15, 2020
Via arXiv

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory

Jun 08, 2020
Via arXiv

Deep Reinforcement Learning with Smooth Policy

Mar 24, 2020
Via arXiv

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Mar 21, 2020
Via arXiv

Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate

Mar 08, 2020
Via arXiv