Optimizing the Long-Term Average Reward for Continuing MDPs: A Technical Report

Apr 13, 2021
Chao Xu, Yiping Xie, Xijun Wang, Howard H. Yang, Dusit Niyato, Tony Q. S. Quek

View paper on arXiv
