Zhuoran Yang

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning

Jan 28, 2022

Exponential Family Model-Based Reinforcement Learning via Score Matching

Dec 28, 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Dec 27, 2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

Dec 27, 2021

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning

Dec 11, 2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

Nov 06, 2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning

Oct 24, 2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Oct 19, 2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs

Oct 18, 2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima

Oct 12, 2021