Philip S. Thomas

Optimization using Parallel Gradient Evaluations on Multiple Parameters (Feb 06, 2023)

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments (Jan 24, 2023)

Enforcing Delayed-Impact Fairness Guarantees (Aug 24, 2022)

Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL (Jun 07, 2022)

Edge-Compatible Reinforcement Learning for Recommendations (Dec 10, 2021)

SOPE: Spectrum of Off-Policy Estimators (Dec 02, 2021)

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs (May 31, 2021)

Universal Off-Policy Evaluation (Apr 26, 2021)

High-Confidence Off-Policy (or Counterfactual) Variance Estimation (Jan 25, 2021)

Towards Safe Policy Improvement for Non-Stationary MDPs (Oct 23, 2020)