Zhuoran Yang

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

Jul 29, 2022
Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang


Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

Jul 25, 2022
Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang


Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games

Jun 03, 2022
Wenhao Zhan, Jason D. Lee, Zhuoran Yang


Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes

May 26, 2022
Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang


Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

May 26, 2022
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang


Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation

May 24, 2022
Xiaoyu Chen, Han Zhong, Zhuoran Yang, Zhaoran Wang, Liwei Wang


Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning

May 05, 2022
Boxiang Lyu, Zhaoran Wang, Mladen Kolar, Zhuoran Yang


Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

Apr 20, 2022
Qi Cai, Zhuoran Yang, Zhaoran Wang


Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets

Mar 07, 2022
Yifei Min, Tianhao Wang, Ruitu Xu, Zhaoran Wang, Michael I. Jordan, Zhuoran Yang


The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches

Mar 03, 2022
Grigoris Velegkas, Zhuoran Yang, Amin Karbasi
