Multi-Task Off-Policy Learning from Bandit Feedback

Dec 09, 2022
Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

View paper on arXiv