Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Add code
Aug 01, 2018
Figure 1 for Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
Figure 2 for Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Share this with someone who'll enjoy it:

View paper onarxiv iconopen_review iconOpenReview

Share this with someone who'll enjoy it: