Zhuoran Yang

Randomized Exploration for Reinforcement Learning with General Value Function Approximation

Jun 15, 2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Feb 27, 2021

Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning

Feb 19, 2021

A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization

Feb 15, 2021

Is Pessimism Provably Efficient for Offline RL?

Dec 30, 2020

Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy

Dec 28, 2020

Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization

Dec 21, 2020

Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations

Nov 09, 2020

Provable Fictitious Play for General Mean-Field Games

Oct 08, 2020

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning

Aug 23, 2020