Philip S. Thomas

SOPE: Spectrum of Off-Policy Estimators

Dec 02, 2021
Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

May 31, 2021
Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Universal Off-Policy Evaluation

Apr 26, 2021
Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas

High-Confidence Off-Policy (or Counterfactual) Variance Estimation

Jan 25, 2021
Yash Chandak, Shiv Shankar, Philip S. Thomas

Towards Safe Policy Improvement for Non-Stationary MDPs

Oct 23, 2020
Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas

Reinforcement Learning for Strategic Recommendations

Sep 15, 2020
Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Evaluating the Performance of Reinforcement Learning Algorithms

Jun 30, 2020
Scott M. Jordan, Yash Chandak, Daniel Cohen, Mengxue Zhang, Philip S. Thomas

Optimizing for the Future in Non-Stationary MDPs

Jun 02, 2020
Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas
