Alert button

Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback

Sep 15, 2021
Ishaan Shah, David Halpern, Kavosh Asadi, Michael L. Littman

Figure 1 for Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback
Figure 2 for Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: