Yasin Abbasi-Yadkori

Context-lumpable stochastic bandits

Jun 22, 2023

Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms

Mar 13, 2022

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits

Jan 17, 2022

Efficient Local Planning with Linear Function Approximation

Aug 12, 2021

Parameter and Feature Selection in Stochastic Linear Bandits

Jun 09, 2021

Improved Regret Bound and Experience Replay in Regularized Policy Iteration

Feb 25, 2021

Optimization Issues in KL-Constrained Approximate Policy Iteration

Feb 11, 2021

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

Feb 04, 2021

The Elliptical Potential Lemma Revisited

Oct 20, 2020

Regret Balancing for Bandit and RL Model Selection

Jun 09, 2020