Zhuoran Yang

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

May 29, 2023
Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

May 08, 2023
Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

Mar 28, 2023
Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor Wai Kin Chan, Xianyuan Zhan

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations

Mar 20, 2023
Siyu Chen, Yitan Wang, Zhaoran Wang, Zhuoran Yang

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model

Mar 15, 2023
Siyu Chen, Jibang Wu, Yifan Wu, Zhuoran Yang

Can We Find Nash Equilibria at a Linear Rate in Markov Games?

Mar 03, 2023
Zhuoqing Song, Jason D. Lee, Zhuoran Yang

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning

Feb 24, 2023
Ruitu Xu, Yifei Min, Tianhao Wang, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang

Offline Policy Optimization in RL with Variance Regularizaton

Dec 29, 2022
Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information

Dec 23, 2022
Zuyue Fu, Zhengling Qi, Zhuoran Yang, Zhaoran Wang, Lan Wang

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality

Dec 19, 2022
Ying Jin, Zhimei Ren, Zhuoran Yang, Zhaoran Wang
