Benjamin Van Roy

Stanford University, Department of Electrical Engineering

Langevin DQN

Feb 17, 2020

Provably Efficient Reinforcement Learning with Aggregated States

Dec 13, 2019

Information-Theoretic Confidence Bounds for Reinforcement Learning

Nov 21, 2019

Comments on the Du-Kakade-Wang-Yang Lower Bounds

Nov 18, 2019

Behaviour Suite for Reinforcement Learning

Aug 13, 2019

On the Performance of Thompson Sampling on Logistic Bandits

May 12, 2019

An Information-Theoretic Analysis for Thompson Sampling with Many Actions

Oct 01, 2018

Deep Exploration via Randomized Value Functions

Jun 06, 2018

Scalable Coordinated Exploration in Concurrent Reinforcement Learning

May 23, 2018

Satisficing in Time-Sensitive Bandit Learning

Mar 07, 2018