Canzhe Zhao

Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback

Nov 14, 2023
Canzhe Zhao, Ruofeng Yang, Baoxiang Wang, Xuezhou Zhang, Shuai Li

Via arXiv

DPMAC: Differentially Private Communication for Cooperative Multi-Agent Reinforcement Learning

Aug 19, 2023
Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li

Via arXiv

Best-of-three-worlds Analysis for Linear Bandits with Follow-the-regularized-leader Algorithm

Mar 13, 2023
Fang Kong, Canzhe Zhao, Shuai Li

Via arXiv

Comparison-based Conversational Recommender System with Relative Bandit Feedback

Aug 21, 2022
Zhihui Xie, Tong Yu, Canzhe Zhao, Shuai Li

Via arXiv

Simultaneously Learning Stochastic and Adversarial Bandits under the Position-Based Model

Jul 12, 2022
Cheng Chen, Canzhe Zhao, Shuai Li

Via arXiv

Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization

Jan 25, 2022
Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li

Via arXiv

Conservative Contextual Combinatorial Cascading Bandit

Apr 23, 2021
Kun Wang, Canzhe Zhao, Shuai Li, Shuo Shao

Via arXiv