
Emma Brunskill

Stanford University

Sublinear Optimal Policy Value Estimation in Contextual Bandits

Dec 13, 2019

Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare

Nov 16, 2019

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

Nov 05, 2019

Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs

Nov 03, 2019

Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling

Oct 15, 2019

Directed Exploration for Reinforcement Learning

Jun 18, 2019

Learning When-to-Treat Policies

May 23, 2019

Combining Parametric and Nonparametric Models for Off-Policy Evaluation

May 16, 2019

PLOTS: Procedure Learning from Observations using Subtask Structure

Apr 17, 2019

Off-Policy Policy Gradient with State Distribution Correction

Apr 17, 2019