Zhihan Xiong

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

Jul 27, 2023
Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

Jun 12, 2023
Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

Offline congestion games: How feedback type affects data coverage requirement

Oct 24, 2022
Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

Learning in Congestion Games with Bandit Feedback

Jun 04, 2022
Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

Selective Sampling for Online Best-arm Identification

Nov 02, 2021
Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin Jamieson

Randomized Exploration is Near-Optimal for Tabular MDP

Feb 19, 2021
Zhihan Xiong, Ruoqi Shen, Simon S. Du

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Dec 23, 2019
Tian Tan, Zhihan Xiong, Vikranth R. Dwaracherla
