Enforcing Delayed-Impact Fairness Guarantees


Aug 24, 2022
Aline Weber, Blossom Metevier, Yuriy Brun, Philip S. Thomas, Bruno Castro da Silva

* 24 pages, 5 figures 


Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL


Jun 07, 2022
Abhinav Bhatia, Philip S. Thomas, Shlomo Zilberstein



Edge-Compatible Reinforcement Learning for Recommendations


Dec 10, 2021
James E. Kostas, Philip S. Thomas, Georgios Theocharous



SOPE: Spectrum of Off-Policy Estimators


Dec 02, 2021
Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum

* Accepted at Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021) 


Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs


May 31, 2021
Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche



Universal Off-Policy Evaluation


Apr 26, 2021
Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas



High-Confidence Off-Policy (or Counterfactual) Variance Estimation


Jan 25, 2021
Yash Chandak, Shiv Shankar, Philip S. Thomas

* Thirty-fifth AAAI Conference on Artificial Intelligence (AAAI 2021) 


Towards Safe Policy Improvement for Non-Stationary MDPs


Oct 23, 2020
Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas

* Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020) 
