Zhihan Xiong

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

Jul 27, 2023

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

Jun 12, 2023

Offline congestion games: How feedback type affects data coverage requirement

Oct 24, 2022

Learning in Congestion Games with Bandit Feedback

Jun 04, 2022

Selective Sampling for Online Best-arm Identification

Nov 02, 2021

Randomized Exploration is Near-Optimal for Tabular MDP

Feb 19, 2021

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Dec 23, 2019