Philip S. Thomas

From Past to Future: Rethinking Eligibility Traces

Dec 20, 2023
Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva

Behavior Alignment via Reward Function Optimization

Oct 31, 2023
Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva

Learning Fair Representations with High-Confidence Guarantees

Oct 23, 2023
Yuhong Luo, Austin Hoag, Philip S. Thomas

Coagent Networks: Generalized and Scaled

May 16, 2023
James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

Optimization using Parallel Gradient Evaluations on Multiple Parameters

Feb 06, 2023
Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Jan 24, 2023
Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas

Enforcing Delayed-Impact Fairness Guarantees

Aug 24, 2022
Aline Weber, Blossom Metevier, Yuriy Brun, Philip S. Thomas, Bruno Castro da Silva

Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL

Jun 07, 2022
Abhinav Bhatia, Philip S. Thomas, Shlomo Zilberstein

Edge-Compatible Reinforcement Learning for Recommendations

Dec 10, 2021
James E. Kostas, Philip S. Thomas, Georgios Theocharous
